Skip to main content Link Menu Expand (external link) Document Search Copy Copied

DSC 100: Database Management Practice Problems

Table of Contents

  1. Relational Data Model and Design
  2. SQL Fundamentals
  3. Joins and Advanced Queries
  4. Indexes and Query Optimization
  5. Normalization and Dependencies
  6. ER Diagrams
Each problem is marked with difficulty: 🟢 Easy 🟡 Medium 🟣 Hard

Relational Data Model

Problem 1 🟢 (Multiple Choice)

Which statement about relations is TRUE?

a) Relations must have at least one tuple

b) The order of tuples in a relation matters

c) All attributes must be of different types

d) A relation is a set of tuples with the same attributes

Solution Answer: d) A relation is a set of tuples with the same attributes
Relations are defined as sets of tuples, where:
- Order doesn't matter (rules out b)
- Can be empty (rules out a)
- Attributes can be of same type (rules out c)

Problem 2 🟡 (Multiple Choice)

Given a relation R(A,B,C,D) with functional dependencies A→B and B→C, what is TRUE?

a) A is a superkey

b) {A,D} is not a superkey

c) B must be a primary key

d) A→C by transitivity

Solution Answer: d) A→C by transitivity
Because:
- A→B and B→C implies A→C (transitivity)
- A is not necessarily a superkey as it doesn't determine D
- {A,D} could be a superkey
- B is not necessarily a key as it doesn't determine A or D

Problem 3 🟢

Given the following relation:

Student(id, name, major, advisor_id, department)

What is the difference between a candidate key and a superkey in this context? Note: id is considered a business key (auto incrementing id)

Solution A superkey is any set of attributes that can uniquely identify a tuple, while a candidate key is a minimal superkey (removing any attribute would make it no longer unique). For instance, some of the super and/or candidate keys in this case:
- {id} is both a candidate key and a superkey
- {id, name} is only a superkey (not minimal)
- {id, major} is only a superkey (not minimal)
- {name} is neither (might not be unique)

Problem 4 🟡

Given these functional dependencies:

A → B
BC → D
D → E

Find all the attributes that are transitively dependent on A.

Solution Let's analyze step by step:
First, identify candidate keys:
A is not a candidate key (only determines B)
BC might be a candidate key (determines D, which determines E)
We need BC to determine all attributes
Look for transitive dependencies:
Direct: A → B
Direct: BC → D
Direct: D → E
Transitive: BC → D → E
Analyze if the transitive dependency BC → D → E is problematic:
BC is likely a candidate key (can determine all attributes)
Therefore, this transitive dependency is NOT problematic because the first determinant (BC) is a full candidate key
No problematic transitive dependencies exist because:
The only transitive dependency involves BC, which is a candidate key
If D alone were a determinant in another FD, that would be problematic since D is not a candidate key
Remember: A transitive dependency is problematic when:
The middle determinant is not part of a candidate key i.e. a prime attribute
The middle determinant is either:
a) A subset of a candidate key, or
b) Contains non-prime attributes

Problem 5 🟢 (Multiple Choice)

Which of the following is NOT an atomic data type in SQL?

a) INTEGER

b) VARCHAR

c) ARRAY

d) DATE

Solution Answer: c) ARRAY
Arrays are not atomic because:
- They can be decomposed into multiple values
- Atomic types must be single, indivisible values
- Common atomic types include INTEGER, VARCHAR, DATE, etc.

Problem 6 🟡

Given this table structure:

CREATE TABLE Orders (
    order_id INTEGER PRIMARY KEY,
    customer_id INTEGER,
    order_date DATE,
    total_amount DECIMAL(10,2),
    status VARCHAR(20)
);

Write all the domain constraints that are implicitly enforced by this schema.

Solution The domain constraints are:
1. order_id must be an integer
2. customer_id must be an integer
3. order_date must be a valid date
4. total_amount must be a decimal number with up to 10 digits total and 2 decimal places
5. status must be a variable-length string up to 20 characters
6. order_id cannot be NULL (due to PRIMARY KEY constraint)

Problem 7 🟣

Consider the following query:

SELECT department, COUNT(*) as emp_count
FROM Employee
GROUP BY department
HAVING COUNT(*) > (
    SELECT AVG(dept_count)
    FROM (
        SELECT COUNT(*) as dept_count
        FROM Employee
        GROUP BY department
    ) as dept_sizes
);

Explain step by step how this query is evaluated by the database engine.

Solution Query evaluation order:
1. Innermost subquery:
- Groups employees by department
- Counts employees in each department
2. Middle subquery:
- Takes those counts
- Calculates average department size
3. Main query:
- Groups employees by department again
- Counts employees in each department
- Compares each count to the average
- Returns only departments with above-average counts
The query finds departments that have more than the average number of employees.

Problem 8 🟡 (Multiple Choice)

Which statement about NULL values in SQL is correct?

a) NULL = NULL evaluates to TRUE

b) NULL values are considered equal to each other

c) NULL values are ignored in COUNT(*)

d) NULL values are excluded from COUNT(column_name)

Solution Answer: d) NULL values are excluded from COUNT(column_name)
Because:
- NULL = NULL evaluates to NULL, not TRUE
- NULL values are not considered equal to each other
- COUNT(*) includes rows with NULL values
- COUNT(column_name) excludes NULL values in that column

Problem 9 🟡

Write a query using window functions to rank employees by salary within each department. Display the following:

  • Employee name
  • Department
  • Salary
  • Rank within department
Solution
SELECT 
    name,
    department,
    salary,
    RANK() OVER (PARTITION BY department ORDER BY salary DESC) as dept_rank
FROM Employee;

Joins and Advanced Queries

Problem 10 🟣

Given tables:

Movies(movie_id, title, release_year)
Actors(actor_id, name)
Roles(movie_id, actor_id, character_name)

Write a query to find all pairs of actors who have appeared in at least 3 movies together.

Solution
WITH ActorPairs AS (
    SELECT 
        r1.actor_id as actor1_id,
        r2.actor_id as actor2_id,
        COUNT(DISTINCT r1.movie_id) as movies_together
    FROM Roles r1
    JOIN Roles r2 ON r1.movie_id = r2.movie_id
    WHERE r1.actor_id < r2.actor_id
    GROUP BY r1.actor_id, r2.actor_id
    HAVING COUNT(DISTINCT r1.movie_id) >= 3
)
SELECT 
    a1.name as actor1_name,
    a2.name as actor2_name,
    ap.movies_together
FROM ActorPairs ap
JOIN Actors a1 ON ap.actor1_id = a1.actor_id
JOIN Actors a2 ON ap.actor2_id = a2.actor_id;

Indexes and Optimization

Problem 11 🟡 (Multiple Choice)

When is a non-clustered index LEAST useful?

a) When querying a small percentage of rows

b) When the table is frequently updated

c) When performing range queries

d) When doing full table scans

Solution Answer: d) When doing full table scans
Because:
- Full table scans need to read all data anyway
- Index would add overhead without benefit
- Other cases can benefit from index usage

Normalization

Problem 12 🟣

Given the relation:

StudentCourse(student_id, student_name, course_id, course_name, instructor_id, instructor_name, department)

With functional dependencies:

  • student_id → student_name
  • course_id → course_name, instructor_id
  • instructor_id → instructor_name, department

Decompose into BCNF. Show work step by step.

Solution 1. **Identify all violations:**
- No partial dependencies are present, which is desirable.
- However, there are multiple transitive dependencies that need addressing.
2. **Decomposition into BCNF:**
- Decompose into the following relations:
     Student(student_id, student_name)
     Course(course_id, course_name, instructor_id)
     Instructor(instructor_id, instructor_name, department)
     Enrollment(student_id, course_id)
    
This achieves BCNF because:
- In each relation, all determinants are candidate keys.
- Transitive dependencies have been eliminated.
- All original functional dependencies are preserved.

ER Diagrams

Problem 13 🟣

Design an ER diagram using Mermaid syntax for a Library Management System with these requirements:

  • Library has multiple branches
  • Each branch has books and employees
  • Books can exist in multiple copies
  • Members can borrow books
  • Members can have fines
  • Employees can be librarians or managers
Solution
erDiagram

    Library ||--|{ Branch : has
    Branch ||--|{ Book : contains
    Branch ||--|{ Employee : employs
    Book ||--|{ Copy : consists-of
    Member ||--|{ Borrow : borrows
    Borrow ||--|| Copy : of
    Member ||--|{ Fine : has

    Library {
        string library_id PK
        string name
    }

    Branch {
        string branch_id PK
        string address
        string library_id FK
    }

    Book {
        string book_id PK
        string title
        string author
    }

    Copy {
        string copy_id PK
        string book_id FK
        string branch_id FK
        string status
    }

    Member {
        string member_id PK
        string name
        string address
    }

    Borrow {
        string member_id FK
        string copy_id FK
        datetime borrow_date
        datetime return_date
    }

    Fine {
        string fine_id PK
        string member_id FK
        float amount
        datetime date
    }

    Employee {
        string employee_id PK
        string branch_id FK
        string name
        string role
        string title "takes either librarian or manager"
    }

Problem 14 🟡 (Multiple Choice)

In an ER diagram, what does a double-line (thick line) relationship indicate?

a) Many-to-many relationship

b) Total participation

c) Weak entity relationship

d) Inheritance relationship

Solution Answer: b) Total participation
Let's break down the key visual elements in ER diagrams:
Double-line (thick line) in relationship:
Indicates total participation
Means every entity in the participating entity set MUST be involved in the relationship
Also known as mandatory participation
Contrasts with partial participation (single line) where participation is optional
Double rectangle (double outline) around entity:
Indicates a weak entity
Different from the double-line relationship!
Weak entities cannot exist without their identifying owner (strong entity)
Example: A room number only makes sense within a specific building
Strong vs Weak Entities:
Strong entities:
Have their own primary key
Can exist independently
Drawn with single rectangle
Example: A Building has its own unique building_id
Weak entities:
Depend on strong entity for identification
Need parent entity's key as part of their identifier
Drawn with double rectangle
Example: Room depends on Building, identified by (building_id, room_number)
Visual Summary:
Single line ──── : Partial participation
Double line ════ : Total participation
Single rectangle └─┐ : Strong entity
Double rectangle └═┐ : Weak entity

Common mistake: Don't confuse double-line relationships (total participation) with double-rectangle entities (weak entities)!