Docs
SQL Basic

SQL Basic

Learn how to use the basic SQL statement to query data from a database.

SELECT

Understanding the SELECT Statement

SELECT is the primary SQL statement for retrieving data from one or more database tables. Every SELECT statement requires two essential components:

  1. What to select (columns or expressions)
  2. Where to select from (table or view name)

Basic SELECT Syntax:

SELECT column_name(s)
FROM table_name;

Key Concepts:

  • Keywords: Reserved SQL words like SELECT, FROM, WHERE
  • Case insensitive: SELECT = select = Select
  • Whitespace flexible: Can span multiple lines
  • Semicolon termination: Required for multiple statements

Retrieving Individual Columns

Single Column Selection:

-- Retrieve product names only
SELECT prod_name
FROM Products;

Output:

prod_name
--------------------
Fish bean bag toy
Bird bean bag toy
Rabbit bean bag toy
8 inch teddy bear
12 inch teddy bear
18 inch teddy bear
Raggedy Ann
King doll
Queen doll

Important Notes:

  • Unsorted by default: Results appear in no guaranteed order
  • All rows returned: No filtering unless specified
  • Raw data: No formatting applied by SQL

Best Practices for Single Columns:

-- ✅ Good: Clear, readable formatting
SELECT prod_name
FROM Products;
 
-- ✅ Also acceptable: Single line for simple queries
SELECT prod_name FROM Products;
 
-- ✅ Multi-line for complex queries
SELECT 
    prod_name
FROM 
    Products;

Retrieving Multiple Columns

Multiple Column Selection:

-- Retrieve product ID, name, and price
SELECT prod_id, prod_name, prod_price
FROM Products;

Output:

prod_id      prod_name               prod_price
---------    --------------------    ----------
BNBG01       Fish bean bag toy             3.49
BNBG02       Bird bean bag toy             3.49
BNBG03       Rabbit bean bag toy           3.49
BR01         8 inch teddy bear             5.99
BR02         12 inch teddy bear            8.99
BR03         18 inch teddy bear           11.99
RGAN01       Raggedy Ann                   4.99
RYL01        King doll                     9.49
RYL02        Queen doll                    9.49

Column Selection Rules:

  • Comma separation: Each column separated by comma
  • No trailing comma: Don't add comma after last column
  • Order matters: Columns appear in specified order
  • Column names: Must exist in the table

Advanced Multiple Column Examples:

-- Customer information
SELECT cust_id, cust_name, cust_email, cust_phone
FROM Customers;
 
-- Order details
SELECT order_num, order_date, cust_id, order_total
FROM Orders;
 
-- Product catalog
SELECT prod_id, prod_name, prod_price, vend_id, prod_desc
FROM Products;

Common Mistakes to Avoid:

-- ❌ Missing comma between columns
SELECT prod_id prod_name, prod_price
FROM Products;
 
-- ❌ Trailing comma after last column
SELECT prod_id, prod_name, prod_price,
FROM Products;
 
-- ❌ Non-existent column name
SELECT prod_id, product_name, prod_price
FROM Products;

Retrieving All Columns

Using the Asterisk (*) Wildcard:

-- Retrieve all columns from Products table
SELECT *
FROM Products;

*When to Use SELECT :

✅ Good Use Cases❌ Avoid When
Exploratory queriesProduction applications
Ad-hoc analysisPerformance-critical queries
Unknown table structureSpecific column needs
Quick data inspectionNetwork bandwidth concerns

Practical Examples:

-- ✅ Good: Exploring new table structure
SELECT * FROM NewCustomerTable;
 
-- ✅ Good: Quick data verification
SELECT * FROM Orders WHERE order_num = 20005;
 
-- ❌ Avoid: Production application
-- This retrieves unnecessary data and hurts performance
SELECT * FROM Products WHERE prod_price > 10;
 
-- ✅ Better: Specify needed columns
SELECT prod_id, prod_name, prod_price 
FROM Products 
WHERE prod_price > 10;

Advantages and Disadvantages:

Advantages:

  • Quick exploration: See all available data
  • Unknown columns: Retrieve columns you don't know exist
  • Simple syntax: Easy to write and remember

Disadvantages:

  • Performance impact: Retrieves unnecessary data
  • Network overhead: More data transferred
  • Application brittleness: Breaks if table structure changes
  • Security concerns: May expose sensitive columns

Retrieving Distinct Rows

The Problem: Duplicate Values

-- This returns duplicate vendor IDs
SELECT vend_id
FROM Products;

Output:

vend_id
----------
BRS01
BRS01
BRS01
DLL01
DLL01
DLL01
DLL01
FNG01
FNG01

The Solution: DISTINCT Keyword

-- This returns only unique vendor IDs
SELECT DISTINCT vend_id
FROM Products;

Output:

vend_id
----------
BRS01
DLL01
FNG01

DISTINCT with Multiple Columns:

-- DISTINCT applies to the combination of ALL columns
SELECT DISTINCT vend_id, prod_price
FROM Products;

Important DISTINCT Rules:

  • Applies to all columns: Can't be partially applied
  • Combination uniqueness: Looks at entire row combination
  • Performance impact: Requires sorting/grouping internally
  • Position matters: Must be first keyword after SELECT

Practical DISTINCT Examples:

-- Find all states where customers are located
SELECT DISTINCT cust_state
FROM Customers;
 
-- Find all unique product categories
SELECT DISTINCT prod_category
FROM Products;
 
-- Find unique combinations of state and city
SELECT DISTINCT cust_state, cust_city
FROM Customers
ORDER BY cust_state, cust_city;
 
-- Count unique customers who placed orders
SELECT COUNT(DISTINCT cust_id) AS unique_customers
FROM Orders;

DISTINCT vs GROUP BY:

-- These queries produce similar results
SELECT DISTINCT cust_state FROM Customers;
 
SELECT cust_state FROM Customers GROUP BY cust_state;
 
-- But GROUP BY allows aggregation
SELECT cust_state, COUNT(*) AS customer_count
FROM Customers
GROUP BY cust_state;

Limiting Results

Database-Specific Syntax Variations:

SQL Server - TOP:

-- Get first 5 products
SELECT TOP 5 prod_name
FROM Products;
 
-- Get top 10 percent
SELECT TOP 10 PERCENT prod_name, prod_price
FROM Products
ORDER BY prod_price DESC;
 
-- With ties (include rows with same values)
SELECT TOP 5 WITH TIES prod_name, prod_price
FROM Products
ORDER BY prod_price DESC;

MySQL/PostgreSQL/SQLite - LIMIT:

-- Get first 5 products
SELECT prod_name
FROM Products
LIMIT 5;
 
-- Get 5 products starting from row 6 (pagination)
SELECT prod_name
FROM Products
LIMIT 5 OFFSET 5;
 
-- MySQL shorthand: LIMIT offset, count
SELECT prod_name
FROM Products
LIMIT 5, 5;  -- Skip 5, take 5

Oracle - ROWNUM and FETCH:

-- Oracle 11g and earlier
SELECT prod_name
FROM Products
WHERE ROWNUM <= 5;
 
-- Oracle 12c and later (preferred)
SELECT prod_name
FROM Products
FETCH FIRST 5 ROWS ONLY;
 
-- With offset (Oracle 12c+)
SELECT prod_name
FROM Products
OFFSET 5 ROWS FETCH NEXT 5 ROWS ONLY;

DB2 - FETCH FIRST:

-- Get first 5 rows
SELECT prod_name
FROM Products
FETCH FIRST 5 ROWS ONLY;

Pagination Examples:

-- Page 1: First 10 products
SELECT prod_id, prod_name, prod_price
FROM Products
ORDER BY prod_name
LIMIT 10 OFFSET 0;
 
-- Page 2: Next 10 products  
SELECT prod_id, prod_name, prod_price
FROM Products
ORDER BY prod_name
LIMIT 10 OFFSET 10;
 
-- Page 3: Next 10 products
SELECT prod_id, prod_name, prod_price
FROM Products
ORDER BY prod_name
LIMIT 10 OFFSET 20;

Performance Considerations:

-- ✅ Good: Use with ORDER BY for consistent results
SELECT prod_name, prod_price
FROM Products
ORDER BY prod_price DESC
LIMIT 10;
 
-- ❌ Problematic: Without ORDER BY, results are unpredictable
SELECT prod_name, prod_price
FROM Products
LIMIT 10;
 
-- ✅ Good: Efficient pagination with indexed columns
SELECT cust_id, cust_name
FROM Customers
ORDER BY cust_id  -- Assuming cust_id is indexed
LIMIT 20 OFFSET 100;

Using Comments

Inline Comments with --:

SELECT prod_name,    -- Product name for display
       prod_price,   -- Current selling price
       vend_id       -- Vendor identifier
FROM Products;       -- Main product catalog

Line Comments with #:

# This query retrieves all product information
# for products priced above $10
SELECT prod_name, prod_price
FROM Products
WHERE prod_price > 10;

Block Comments with / /:

/*
 * Product Analysis Query
 * Author: Database Team
 * Date: 2024-01-15
 * Purpose: Retrieve high-value products for marketing analysis
 */
SELECT prod_id, prod_name, prod_price
FROM Products
WHERE prod_price > 15;
 
/* Temporarily disabled - alternative query
SELECT prod_id, prod_name, prod_price, vend_id
FROM Products
WHERE prod_price BETWEEN 10 AND 20;
*/

Comment Best Practices:

-- ✅ Good: Explain complex logic
SELECT 
    prod_name,
    prod_price,
    prod_price * 0.9 AS discounted_price  -- 10% discount for promotion
FROM Products
WHERE prod_price > 10;  -- Only products above minimum threshold
 
-- ✅ Good: Document business rules
/*
 * Customer Status Classification:
 * - Premium: Orders > $1000 total
 * - Standard: Orders $100-$1000
 * - Basic: Orders < $100
 */
SELECT 
    cust_id,
    cust_name,
    CASE 
        WHEN total_orders > 1000 THEN 'Premium'
        WHEN total_orders >= 100 THEN 'Standard'
        ELSE 'Basic'
    END AS customer_status
FROM CustomerSummary;
 
-- ✅ Good: Temporary debugging
SELECT prod_id, prod_name, prod_price
FROM Products
-- WHERE prod_price > 10  -- Temporarily disabled for testing
ORDER BY prod_name;

Advanced SELECT Techniques

Column Aliases:

-- Make output more readable
SELECT 
    prod_id AS "Product ID",
    prod_name AS "Product Name", 
    prod_price AS "Price ($)"
FROM Products;
 
-- Calculate derived values
SELECT 
    prod_name,
    prod_price,
    prod_price * 1.1 AS price_with_tax,
    prod_price * 0.9 AS sale_price
FROM Products;

Expressions and Calculations:

-- Mathematical operations
SELECT 
    order_num,
    quantity,
    item_price,
    quantity * item_price AS line_total
FROM OrderItems;
 
-- String operations
SELECT 
    cust_name,
    cust_city + ', ' + cust_state AS location  -- SQL Server
    -- CONCAT(cust_city, ', ', cust_state) AS location  -- MySQL/PostgreSQL
FROM Customers;
 
-- Date operations
SELECT 
    order_num,
    order_date,
    DATEDIFF(day, order_date, GETDATE()) AS days_since_order  -- SQL Server
FROM Orders;

Conditional Logic:

-- CASE statements for conditional output
SELECT 
    prod_name,
    prod_price,
    CASE 
        WHEN prod_price < 5 THEN 'Budget'
        WHEN prod_price < 15 THEN 'Standard'
        ELSE 'Premium'
    END AS price_category
FROM Products;
 
-- NULL handling
SELECT 
    cust_name,
    ISNULL(cust_email, 'No email provided') AS email_status  -- SQL Server
    -- COALESCE(cust_email, 'No email provided') AS email_status  -- Standard SQL
FROM Customers;

Performance Optimization Tips

Efficient Column Selection:

-- ✅ Good: Select only needed columns
SELECT cust_id, cust_name
FROM Customers
WHERE cust_state = 'CA';
 
-- ❌ Inefficient: Selecting unnecessary data
SELECT *
FROM Customers
WHERE cust_state = 'CA';

Index-Friendly Queries:

-- ✅ Good: Use indexed columns in WHERE clauses
SELECT prod_name, prod_price
FROM Products
WHERE prod_id = 'BR01';  -- Assuming prod_id is indexed
 
-- ✅ Good: Limit results for large tables
SELECT TOP 100 cust_name, cust_email
FROM Customers
ORDER BY cust_name;

Challenge Solutions

Challenge 1: Retrieve all customer IDs

-- Solution: Basic SELECT for single column
SELECT cust_id
FROM Customers;

Challenge 2: Unique products ordered

-- Solution: Use DISTINCT to eliminate duplicates
SELECT DISTINCT prod_id
FROM OrderItems;
 
-- This should return 7 unique product IDs

Challenge 3: All columns vs customer ID with comments

-- Solution: Use comments to switch between queries
 
/* All columns version */
SELECT *
FROM Customers;
 
-- Single column version
-- SELECT cust_id FROM Customers;
 
/* Alternative approach - comment out the other query
SELECT cust_id
FROM Customers;
*/
 
-- All columns version (commented out)
-- SELECT * FROM Customers;

Common SELECT Patterns

Data Exploration:

-- Quick table overview
SELECT * FROM Products LIMIT 5;
 
-- Check data types and sample values
SELECT 
    prod_id,        -- Character data
    prod_name,      -- Variable text
    prod_price,     -- Numeric data
    vend_id         -- Foreign key reference
FROM Products
LIMIT 3;

Data Quality Checks:

-- Find missing data
SELECT COUNT(*) AS total_rows,
       COUNT(cust_email) AS rows_with_email,
       COUNT(*) - COUNT(cust_email) AS missing_emails
FROM Customers;
 
-- Check for duplicates
SELECT cust_email, COUNT(*) AS duplicate_count
FROM Customers
WHERE cust_email IS NOT NULL
GROUP BY cust_email
HAVING COUNT(*) > 1;

Reporting Queries:

-- Summary information
SELECT 
    COUNT(*) AS total_products,
    MIN(prod_price) AS lowest_price,
    MAX(prod_price) AS highest_price,
    AVG(prod_price) AS average_price
FROM Products;
 
-- Categorized data
SELECT 
    vend_id,
    COUNT(*) AS product_count,
    AVG(prod_price) AS avg_price
FROM Products
GROUP BY vend_id
ORDER BY product_count DESC;

Data Sorting

Understanding Data Sorting

Without explicit sorting, data appears in unpredictable order. The sequence depends on how the database stores and manages data internally, which can change due to updates, deletions, and storage optimization.

Why Sorting Matters:

-- Without ORDER BY: Unpredictable order
SELECT prod_name
FROM Products;

Output (unpredictable):

prod_name
--------------------
Fish bean bag toy
Bird bean bag toy
Rabbit bean bag toy
8 inch teddy bear
12 inch teddy bear
18 inch teddy bear
Raggedy Ann
King doll
Queen doll

Key Principles:

  • No guaranteed order: Never assume data will appear in any particular sequence
  • Storage dependent: Order may reflect insertion sequence, but this can change
  • Explicit control required: Use ORDER BY to ensure predictable results
  • Relational theory: Unordered data is the default state in relational databases

Basic ORDER BY Syntax

Single Column Sorting:

-- Sort products alphabetically by name
SELECT prod_name
FROM Products
ORDER BY prod_name;

Output (predictable alphabetical order):

prod_name
--------------------
12 inch teddy bear
18 inch teddy bear
8 inch teddy bear
Bird bean bag toy
Fish bean bag toy
King doll
Queen doll
Rabbit bean bag toy
Raggedy Ann

Essential ORDER BY Rules:

  • Last clause: ORDER BY must be the final clause in SELECT statements
  • Column specification: Can sort by selected or non-selected columns
  • Case sensitivity: Depends on database configuration
  • Required for predictability: Only way to guarantee result order

ORDER BY Clause Position:

-- ✅ Correct: ORDER BY at the end
SELECT prod_name, prod_price
FROM Products
WHERE prod_price > 5
ORDER BY prod_name;
 
-- ❌ Error: ORDER BY not at the end
SELECT prod_name, prod_price
FROM Products
ORDER BY prod_name
WHERE prod_price > 5;  -- This will cause an error

Sorting by Non-Selected Columns:

-- Sort by a column not in the SELECT list
SELECT prod_name, prod_price
FROM Products
ORDER BY vend_id;  -- vend_id not selected but used for sorting
 
-- Practical example: Display customer names sorted by registration date
SELECT cust_name, cust_email
FROM Customers
ORDER BY cust_registration_date;

Sorting by Multiple Columns

Multi-Column Sort Logic:

-- Sort by price first, then by name within each price group
SELECT prod_id, prod_price, prod_name
FROM Products
ORDER BY prod_price, prod_name;

Output:

prod_id    prod_price    prod_name
-------    ----------    --------------------
BNBG02     3.4900        Bird bean bag toy
BNBG01     3.4900        Fish bean bag toy
BNBG03     3.4900        Rabbit bean bag toy
RGAN01     4.9900        Raggedy Ann
BR01       5.9900        8 inch teddy bear
BR02       8.9900        12 inch teddy bear
RYL01      9.4900        King doll
RYL02      9.4900        Queen doll
BR03       11.9900       18 inch teddy bear

Sort Priority Understanding:

  1. Primary sort: First column (prod_price) determines main order
  2. Secondary sort: Second column (prod_name) sorts within groups of equal primary values
  3. Tertiary sort: Additional columns continue the hierarchy

Complex Multi-Column Examples:

-- Customer report: Sort by state, then city, then name
SELECT cust_name, cust_city, cust_state
FROM Customers
ORDER BY cust_state, cust_city, cust_name;
 
-- Sales analysis: Sort by date, then customer, then amount
SELECT order_date, cust_id, order_total
FROM Orders
ORDER BY order_date, cust_id, order_total DESC;
 
-- Product catalog: Sort by category, then price (high to low), then name
SELECT prod_category, prod_price, prod_name
FROM Products
ORDER BY prod_category, prod_price DESC, prod_name;

When Secondary Sorting Matters:

-- Example showing why multiple columns matter
SELECT cust_state, cust_name
FROM Customers
ORDER BY cust_state, cust_name;
 
-- Without secondary sort, customers in same state appear randomly
-- With secondary sort, customers in same state are alphabetical

Sorting by Column Position

Using Numeric Positions:

-- Sort by column positions instead of names
SELECT prod_id, prod_price, prod_name
FROM Products
ORDER BY 2, 3;  -- Sort by 2nd column (prod_price), then 3rd (prod_name)

Position Reference:

  • 1: First column in SELECT list (prod_id)
  • 2: Second column in SELECT list (prod_price)
  • 3: Third column in SELECT list (prod_name)

Advantages and Disadvantages:

✅ Advantages❌ Disadvantages
Shorter syntaxError prone
Less typingHard to maintain
Quick for simple queriesUnclear intent

Best Practices for Column Positions:

-- ✅ Good: Use for simple, stable queries
SELECT prod_name, prod_price
FROM Products
ORDER BY 2 DESC, 1;  -- Price descending, name ascending
 
-- ❌ Avoid: Complex queries with many columns
SELECT prod_id, prod_name, prod_price, vend_id, prod_desc, prod_category
FROM Products
ORDER BY 3, 6, 2;  -- Hard to understand what's being sorted
 
-- ✅ Better: Use explicit column names for clarity
SELECT prod_id, prod_name, prod_price, vend_id, prod_desc, prod_category
FROM Products
ORDER BY prod_price, prod_category, prod_name;

Mixed Approach:

-- Combine column names and positions
SELECT prod_name, prod_price, vend_id
FROM Products
ORDER BY 2 DESC, vend_id, prod_name;  -- Mix positions and names

Specifying Sort Direction

ASC vs DESC Keywords:

-- Ascending order (default)
SELECT prod_name, prod_price
FROM Products
ORDER BY prod_price ASC;  -- ASC is optional (default)
 
-- Descending order
SELECT prod_name, prod_price
FROM Products
ORDER BY prod_price DESC;

Descending Sort Example:

-- Most expensive products first
SELECT prod_id, prod_price, prod_name
FROM Products
ORDER BY prod_price DESC;

Output:

prod_id    prod_price    prod_name
-------    ----------    --------------------
BR03       11.9900       18 inch teddy bear
RYL01      9.4900        King doll
RYL02      9.4900        Queen doll
BR02       8.9900        12 inch teddy bear
BR01       5.9900        8 inch teddy bear
RGAN01     4.9900        Raggedy Ann
BNBG01     3.4900        Fish bean bag toy
BNBG02     3.4900        Bird bean bag toy
BNBG03     3.4900        Rabbit bean bag toy

Multiple Columns with Different Directions:

-- Price descending, name ascending within each price
SELECT prod_id, prod_price, prod_name
FROM Products
ORDER BY prod_price DESC, prod_name ASC;

Output:

prod_id    prod_price    prod_name
-------    ----------    --------------------
BR03       11.9900       18 inch teddy bear
RYL01      9.4900        King doll
RYL02      9.4900        Queen doll
BR02       8.9900        12 inch teddy bear
BR01       5.9900        8 inch teddy bear
RGAN01     4.9900        Raggedy Ann
BNBG02     3.4900        Bird bean bag toy
BNBG01     3.4900        Fish bean bag toy
BNBG03     3.4900        Rabbit bean bag toy

Important Direction Rules:

-- ❌ Wrong: DESC applies only to preceding column
SELECT prod_name, prod_price, vend_id
FROM Products
ORDER BY prod_price, prod_name DESC;  -- Only prod_name is descending
 
-- ✅ Correct: Specify DESC for each column needing it
SELECT prod_name, prod_price, vend_id
FROM Products
ORDER BY prod_price DESC, prod_name DESC;  -- Both columns descending

Practical Sorting Examples

Business Reporting:

-- Sales report: Top customers by total orders
SELECT cust_name, COUNT(*) AS order_count, SUM(order_total) AS total_spent
FROM Customers c
JOIN Orders o ON c.cust_id = o.cust_id
GROUP BY cust_name
ORDER BY total_spent DESC, order_count DESC;
 
-- Product performance: Best sellers first
SELECT p.prod_name, SUM(oi.quantity) AS total_sold, AVG(oi.item_price) AS avg_price
FROM Products p
JOIN OrderItems oi ON p.prod_id = oi.prod_id
GROUP BY p.prod_name
ORDER BY total_sold DESC, avg_price DESC;

Data Analysis:

-- Customer segmentation: Sort by state, then by order value
SELECT cust_state, cust_name, 
       COALESCE(SUM(order_total), 0) AS lifetime_value
FROM Customers c
LEFT JOIN Orders o ON c.cust_id = o.cust_id
GROUP BY cust_state, cust_name
ORDER BY cust_state, lifetime_value DESC;
 
-- Inventory management: Low stock items first
SELECT prod_name, quantity_in_stock, reorder_level
FROM Products
ORDER BY quantity_in_stock ASC, reorder_level DESC;

Date and Time Sorting:

-- Recent orders first
SELECT order_num, order_date, cust_id, order_total
FROM Orders
ORDER BY order_date DESC, order_total DESC;
 
-- Chronological order with time
SELECT order_num, order_datetime, cust_id
FROM Orders
ORDER BY order_datetime ASC;
 
-- Monthly sales summary
SELECT YEAR(order_date) AS order_year, 
       MONTH(order_date) AS order_month,
       SUM(order_total) AS monthly_total
FROM Orders
GROUP BY YEAR(order_date), MONTH(order_date)
ORDER BY order_year DESC, order_month DESC;

Advanced Sorting Techniques

Conditional Sorting:

-- Sort by custom business logic
SELECT prod_name, prod_price, prod_category
FROM Products
ORDER BY 
    CASE prod_category
        WHEN 'Electronics' THEN 1
        WHEN 'Clothing' THEN 2
        WHEN 'Books' THEN 3
        ELSE 4
    END,
    prod_price DESC;

NULL Value Handling:

-- Control NULL sorting behavior
SELECT cust_name, cust_email
FROM Customers
ORDER BY 
    CASE WHEN cust_email IS NULL THEN 1 ELSE 0 END,  -- NULLs last
    cust_email,
    cust_name;
 
-- Database-specific NULL handling
-- PostgreSQL: NULLS FIRST or NULLS LAST
SELECT cust_name, cust_email
FROM Customers
ORDER BY cust_email NULLS LAST, cust_name;

String Sorting Considerations:

-- Case-insensitive sorting (database dependent)
SELECT cust_name
FROM Customers
ORDER BY UPPER(cust_name);  -- Convert to uppercase for sorting
 
-- Numeric sorting of string data
SELECT prod_code
FROM Products
ORDER BY CAST(SUBSTRING(prod_code, 5, 10) AS INTEGER);  -- Sort by numeric part

Performance Considerations

Index-Friendly Sorting:

-- ✅ Good: Sort by indexed columns
SELECT cust_name, cust_email
FROM Customers
ORDER BY cust_id;  -- Assuming cust_id is indexed (primary key)
 
-- ✅ Good: Composite index usage
CREATE INDEX idx_orders_date_customer ON Orders(order_date, cust_id);
SELECT order_num, order_date, cust_id
FROM Orders
ORDER BY order_date, cust_id;  -- Uses the composite index

Avoiding Performance Issues:

-- ❌ Slow: Sorting by calculated columns
SELECT prod_name, prod_price * 1.1 AS price_with_tax
FROM Products
ORDER BY prod_price * 1.1;  -- Calculation for every row
 
-- ✅ Better: Use column alias
SELECT prod_name, prod_price * 1.1 AS price_with_tax
FROM Products
ORDER BY price_with_tax;  -- Some databases optimize this
 
-- ❌ Slow: Sorting large result sets without LIMIT
SELECT *
FROM LargeTable
ORDER BY some_column;  -- Sorts millions of rows
 
-- ✅ Better: Limit results when possible
SELECT *
FROM LargeTable
ORDER BY some_column
LIMIT 100;  -- Only sort what you need

Challenge Solutions

Challenge 1: Customer names Z to A

-- Solution: Use DESC for descending alphabetical order
SELECT cust_name
FROM Customers
ORDER BY cust_name DESC;

Challenge 2: Customer ID and order number sorted

-- Solution: Sort by customer ID first, then order date descending
SELECT cust_id, order_num
FROM Orders
ORDER BY cust_id, order_date DESC;

Challenge 3: Quantity and price, highest first

-- Solution: Both columns need DESC for highest first
SELECT quantity, item_price
FROM OrderItems
ORDER BY quantity DESC, item_price DESC;

Challenge 4: What's wrong with this SQL?

-- ❌ Problem: Missing BY keyword and trailing comma
SELECT vend_name,
FROM Vendors
ORDER vend_name DESC;
 
-- ✅ Corrected version:
SELECT vend_name
FROM Vendors
ORDER BY vend_name DESC;

Errors in the original:

  1. Trailing comma after vend_name in SELECT
  2. Missing BY keyword in ORDER clause

Common ORDER BY Mistakes

Syntax Errors:

-- ❌ Wrong: ORDER BY not at end
SELECT prod_name
FROM Products
ORDER BY prod_name
WHERE prod_price > 10;
 
-- ❌ Wrong: Missing BY keyword
SELECT prod_name
FROM Products
ORDER prod_name;
 
-- ❌ Wrong: Invalid column reference
SELECT prod_name
FROM Products
ORDER BY prod_description;  -- Column not in table

Logic Errors:

-- ❌ Confusing: Inconsistent direction expectations
SELECT prod_name, prod_price
FROM Products
ORDER BY prod_price DESC, prod_name DESC;  -- Both descending might not be intended
 
-- ❌ Performance issue: Sorting without purpose
SELECT *
FROM LargeTable
ORDER BY random_column;  -- Unnecessary sorting of large dataset

Best Practices Summary

1. Always Use ORDER BY for Predictable Results:

-- ✅ Good: Explicit ordering
SELECT cust_name, cust_email
FROM Customers
ORDER BY cust_name;
 
-- ❌ Risky: Relying on default order
SELECT cust_name, cust_email
FROM Customers;  -- Order unpredictable

2. Choose Meaningful Sort Orders:

-- ✅ Good: Business-relevant sorting
SELECT prod_name, prod_price
FROM Products
ORDER BY prod_price DESC, prod_name;  -- Expensive first, then alphabetical
 
-- ✅ Good: User-friendly sorting
SELECT cust_name, cust_city, cust_state
FROM Customers
ORDER BY cust_state, cust_city, cust_name;  -- Geographic then alphabetical

3. Consider Performance:

-- ✅ Good: Use indexed columns when possible
SELECT order_num, order_date
FROM Orders
ORDER BY order_date DESC;  -- If order_date is indexed
 
-- ✅ Good: Limit large result sets
SELECT TOP 50 prod_name, prod_price
FROM Products
ORDER BY prod_price DESC;

4. Be Explicit About Direction:

-- ✅ Good: Clear intent
SELECT prod_name, prod_price
FROM Products
ORDER BY prod_price DESC, prod_name ASC;
 
-- ✅ Acceptable: Default ASC is clear
SELECT prod_name, prod_price
FROM Products
ORDER BY prod_price DESC, prod_name;

Data Filtering

Understanding Data Filtering

The WHERE clause specifies search criteria to filter data and retrieve only the rows that match your conditions. Without filtering, you retrieve all rows from a table, which is rarely what you need.

Why Filter Data:

BenefitDescriptionImpact
PerformanceRetrieve only needed dataFaster queries, less network traffic
RelevanceGet specific informationMore useful results
EfficiencyReduce processing overheadBetter resource utilization
AccuracyFocus on relevant recordsMore precise analysis

Database vs Application Filtering:

-- ✅ Good: Database-level filtering (efficient)
SELECT prod_name, prod_price
FROM Products
WHERE prod_price > 10;
 
-- ❌ Bad: Application-level filtering (inefficient)
-- Retrieve all data, then filter in application code
SELECT prod_name, prod_price
FROM Products;
-- Then filter in application: if (prod_price > 10) { ... }

Basic WHERE Clause Syntax

Standard WHERE Structure:

SELECT column_list
FROM table_name
WHERE condition;

Simple Equality Example:

-- Find products with specific price
SELECT prod_name, prod_price
FROM Products
WHERE prod_price = 3.49;

Output:

prod_name              prod_price
-------------------    ----------
Fish bean bag toy      3.49
Bird bean bag toy      3.49
Rabbit bean bag toy    3.49

Important WHERE Rules:

  • Position: WHERE comes after FROM, before ORDER BY
  • Required for filtering: Only way to limit rows returned
  • Case sensitivity: Depends on database configuration
  • Data type matching: Use appropriate comparison methods

Clause Order in SELECT Statements:

-- ✅ Correct order
SELECT prod_name, prod_price
FROM Products
WHERE prod_price > 5
ORDER BY prod_price;
 
-- ❌ Wrong order - will cause error
SELECT prod_name, prod_price
FROM Products
ORDER BY prod_price
WHERE prod_price > 5;

WHERE Clause Operators

Comparison Operators:

OperatorDescriptionExample
=EqualityWHERE price = 10
<>Not equalWHERE price {'<>'} 10
!=Not equal (alternative)WHERE price != 10
<Less thanWHERE price {'<'} 10
<=Less than or equalWHERE price {'<='} 10
>Greater thanWHERE price {'>'} 10
>=Greater than or equalWHERE price {'>='} 10
!<Not less thanWHERE price !{'<'} 10
!>Not greater thanWHERE price !{'>'} 10

Special Operators:

OperatorDescriptionExample
BETWEENRange checkWHERE price BETWEEN 5 AND 10
IS NULLNULL value checkWHERE email IS NULL
IS NOT NULLNot NULL checkWHERE email IS NOT NULL

Checking Against Single Values

Equality Testing:

-- Find specific product
SELECT prod_id, prod_name, prod_price
FROM Products
WHERE prod_id = 'BR01';
 
-- Find customers in specific state
SELECT cust_name, cust_city
FROM Customers
WHERE cust_state = 'CA';
 
-- Find orders on specific date
SELECT order_num, cust_id, order_total
FROM Orders
WHERE order_date = '2020-01-01';

Less Than Comparisons:

-- Products under $10
SELECT prod_name, prod_price
FROM Products
WHERE prod_price < 10;

Output:

prod_name              prod_price
-------------------    ----------
Fish bean bag toy      3.49
Bird bean bag toy      3.49
Rabbit bean bag toy    3.49
8 inch teddy bear      5.99
12 inch teddy bear     8.99
Raggedy Ann            4.99
King doll              9.49
Queen doll             9.49

Less Than or Equal:

-- Products $10 or less
SELECT prod_name, prod_price
FROM Products
WHERE prod_price <= 10;
 
-- Orders from last 30 days
SELECT order_num, order_date, cust_id
FROM Orders
WHERE order_date >= DATEADD(day, -30, GETDATE());  -- SQL Server
-- WHERE order_date >= DATE_SUB(NOW(), INTERVAL 30 DAY);  -- MySQL

Greater Than Comparisons:

-- Expensive products
SELECT prod_name, prod_price
FROM Products
WHERE prod_price > 15;
 
-- Large orders
SELECT order_num, order_total, cust_id
FROM Orders
WHERE order_total > 1000;
 
-- Recent customers
SELECT cust_name, cust_registration_date
FROM Customers
WHERE cust_registration_date > '2023-01-01';

Checking for Nonmatches

Not Equal Operators:

-- Products not from specific vendor
SELECT vend_id, prod_name
FROM Products
WHERE vend_id <> 'DLL01';

Output:

vend_id       prod_name
----------    ------------------
BRS01         8 inch teddy bear
BRS01         12 inch teddy bear
BRS01         18 inch teddy bear
FNG01         King doll
FNG01         Queen doll

Alternative Not Equal Syntax:

-- Same result using != operator
SELECT vend_id, prod_name
FROM Products
WHERE vend_id != 'DLL01';
 
-- Customers not in specific state
SELECT cust_name, cust_state
FROM Customers
WHERE cust_state != 'CA';
 
-- Orders not from specific customer
SELECT order_num, order_date, cust_id
FROM Orders
WHERE cust_id <> 'CUST001';

String vs Numeric Comparisons:

-- ✅ Correct: String values in quotes
SELECT cust_name
FROM Customers
WHERE cust_state = 'NY';
 
-- ✅ Correct: Numeric values without quotes
SELECT prod_name, prod_price
FROM Products
WHERE prod_price = 9.49;
 
-- ❌ Wrong: Numeric value in quotes (may work but inefficient)
SELECT prod_name, prod_price
FROM Products
WHERE prod_price = '9.49';
 
-- ❌ Wrong: String value without quotes (will cause error)
SELECT cust_name
FROM Customers
WHERE cust_state = NY;

Checking for Range of Values

BETWEEN Operator:

-- Products in price range
SELECT prod_name, prod_price
FROM Products
WHERE prod_price BETWEEN 5 AND 10;

Output:

prod_name              prod_price
-------------------    ----------
8 inch teddy bear      5.99
12 inch teddy bear     8.99
King doll              9.49
Queen doll             9.49

BETWEEN Characteristics:

  • Inclusive: Includes both boundary values
  • Requires AND: Must use AND between values
  • Order matters: Lower value first, then higher value

Date Range Examples:

-- Orders in date range
SELECT order_num, order_date, order_total
FROM Orders
WHERE order_date BETWEEN '2020-01-01' AND '2020-12-31';
 
-- Recent orders (last 7 days)
SELECT order_num, order_date, cust_id
FROM Orders
WHERE order_date BETWEEN DATEADD(day, -7, GETDATE()) AND GETDATE();  -- SQL Server
 
-- Quarterly sales
SELECT order_num, order_date, order_total
FROM Orders
WHERE order_date BETWEEN '2024-01-01' AND '2024-03-31';

Numeric Range Examples:

-- Medium-priced products
SELECT prod_name, prod_price, prod_category
FROM Products
WHERE prod_price BETWEEN 10 AND 50;
 
-- Order quantities in range
SELECT order_num, prod_id, quantity
FROM OrderItems
WHERE quantity BETWEEN 5 AND 20;
 
-- Customer age range (if age column exists)
SELECT cust_name, cust_age
FROM Customers
WHERE cust_age BETWEEN 25 AND 65;

Alternative to BETWEEN:

-- BETWEEN equivalent using AND
SELECT prod_name, prod_price
FROM Products
WHERE prod_price >= 5 AND prod_price <= 10;
 
-- Sometimes more readable for complex conditions
SELECT order_num, order_total
FROM Orders
WHERE order_total >= 100 AND order_total < 1000;

Checking for NULL Values

Understanding NULL:

  • NULL means "no value" - different from 0, empty string, or spaces
  • Cannot use = or != with NULL - special operators required
  • NULL comparisons always return unknown - not true or false

IS NULL Operator:

-- Find products with no price
SELECT prod_name
FROM Products
WHERE prod_price IS NULL;
 
-- Find customers with no email
SELECT cust_name
FROM Customers
WHERE cust_email IS NULL;

Output:

cust_name
----------
Kids Place
The Toy Store

IS NOT NULL Operator:

-- Find customers with email addresses
SELECT cust_name, cust_email
FROM Customers
WHERE cust_email IS NOT NULL;
 
-- Find products with descriptions
SELECT prod_name, prod_desc
FROM Products
WHERE prod_desc IS NOT NULL;
 
-- Complete customer records only
SELECT cust_name, cust_phone, cust_email
FROM Customers
WHERE cust_phone IS NOT NULL 
  AND cust_email IS NOT NULL;

NULL Behavior in Filtering:

-- ❌ Wrong: This won't find NULL values
SELECT cust_name
FROM Customers
WHERE cust_email = NULL;  -- Returns no results
 
-- ❌ Wrong: This also won't find NULL values
SELECT cust_name
FROM Customers
WHERE cust_email != 'something';  -- NULLs not included
 
-- ✅ Correct: Explicitly check for NULL
SELECT cust_name
FROM Customers
WHERE cust_email IS NULL;
 
-- ✅ Correct: Include NULLs in non-match queries
SELECT cust_name
FROM Customers
WHERE cust_email != 'gmail.com' OR cust_email IS NULL;

Advanced Filtering Examples

Business Logic Filtering:

-- Active customers with recent orders
SELECT c.cust_name, c.cust_email
FROM Customers c
WHERE c.cust_status = 'ACTIVE'
  AND c.last_order_date >= DATEADD(month, -6, GETDATE());
 
-- High-value products in stock
SELECT prod_name, prod_price, quantity_in_stock
FROM Products
WHERE prod_price > 50
  AND quantity_in_stock > 0;
 
-- Overdue orders
SELECT order_num, order_date, ship_date
FROM Orders
WHERE ship_date IS NULL
  AND order_date < DATEADD(day, -7, GETDATE());

Data Quality Checks:

-- Find incomplete customer records
SELECT cust_id, cust_name
FROM Customers
WHERE cust_email IS NULL
   OR cust_phone IS NULL
   OR cust_address IS NULL;
 
-- Find products with pricing issues
SELECT prod_id, prod_name, prod_price
FROM Products
WHERE prod_price IS NULL
   OR prod_price <= 0;
 
-- Find orders without items
SELECT o.order_num, o.order_date
FROM Orders o
WHERE NOT EXISTS (
    SELECT 1 FROM OrderItems oi 
    WHERE oi.order_num = o.order_num
);

Performance Optimization

Index-Friendly Filtering:

-- ✅ Good: Filter on indexed columns
SELECT cust_name, cust_email
FROM Customers
WHERE cust_id = 'CUST001';  -- Primary key lookup
 
-- ✅ Good: Range queries on indexed dates
SELECT order_num, order_total
FROM Orders
WHERE order_date >= '2024-01-01';  -- If order_date is indexed
 
-- ❌ Slow: Functions in WHERE clause
SELECT cust_name
FROM Customers
WHERE UPPER(cust_name) = 'JOHN DOE';  -- Can't use index
 
-- ✅ Better: Store data in searchable format
SELECT cust_name
FROM Customers
WHERE cust_name = 'John Doe';  -- Can use index if data is consistent

Efficient Filtering Patterns:

-- ✅ Good: Selective filters first
SELECT prod_name, prod_price
FROM Products
WHERE prod_category = 'Electronics'  -- Selective filter
  AND prod_price > 100;              -- Further refinement
 
-- ✅ Good: Use LIMIT with filtering
SELECT TOP 10 cust_name, order_total
FROM Customers c
JOIN Orders o ON c.cust_id = o.cust_id
WHERE order_total > 1000
ORDER BY order_total DESC;

Challenge Solutions

Challenge 1: Products with price 9.49

-- Solution: Simple equality filter
SELECT prod_id, prod_name
FROM Products
WHERE prod_price = 9.49;

Challenge 2: Products with price 9 or more

-- Solution: Greater than or equal filter
SELECT prod_id, prod_name
FROM Products
WHERE prod_price >= 9;

Challenge 3: Order numbers with 100+ items

-- Solution: Combine filtering with DISTINCT
SELECT DISTINCT order_num
FROM OrderItems
WHERE quantity >= 100;

Challenge 4: Products priced between 3 and 6, sorted

-- Solution: BETWEEN with ORDER BY
SELECT prod_name, prod_price
FROM Products
WHERE prod_price BETWEEN 3 AND 6
ORDER BY prod_price;
 
-- Alternative solution using comparison operators
SELECT prod_name, prod_price
FROM Products
WHERE prod_price >= 3 AND prod_price <= 6
ORDER BY prod_price;

Common WHERE Clause Mistakes

Syntax Errors:

-- ❌ Wrong: Missing quotes for string
SELECT cust_name
FROM Customers
WHERE cust_state = CA;
 
-- ❌ Wrong: Using = with NULL
SELECT cust_name
FROM Customers
WHERE cust_email = NULL;
 
-- ❌ Wrong: Incorrect BETWEEN syntax
SELECT prod_name
FROM Products
WHERE prod_price BETWEEN 5, 10;  -- Missing AND

Logic Errors:

-- ❌ Wrong: Not considering NULL values
SELECT cust_name
FROM Customers
WHERE cust_email != 'gmail.com';  -- Excludes NULLs
 
-- ✅ Correct: Include NULLs if intended
SELECT cust_name
FROM Customers
WHERE cust_email != 'gmail.com' OR cust_email IS NULL;

Best Practices Summary

1. Use Database-Level Filtering:

-- ✅ Good: Filter at database level
SELECT cust_name, cust_email
FROM Customers
WHERE cust_state = 'CA';
 
-- ❌ Avoid: Application-level filtering
-- SELECT cust_name, cust_email FROM Customers;
-- Then filter in application code

2. Choose Appropriate Operators:

-- ✅ Good: Use BETWEEN for ranges
SELECT prod_name, prod_price
FROM Products
WHERE prod_price BETWEEN 10 AND 50;
 
-- ✅ Good: Use IS NULL for NULL checks
SELECT cust_name
FROM Customers
WHERE cust_email IS NULL;

3. Consider Performance:

-- ✅ Good: Filter on indexed columns
SELECT order_num, order_date
FROM Orders
WHERE cust_id = 'CUST001';  -- If cust_id is indexed
 
-- ✅ Good: Use selective filters
SELECT prod_name
FROM Products
WHERE prod_category = 'Electronics'  -- Selective
  AND prod_price > 1000;             -- Further refinement

Advanced Data Filtering

Understanding Advanced Filtering

Advanced filtering allows you to combine multiple conditions to create sophisticated search criteria that go beyond simple single-condition filters. This enables precise data retrieval for complex business requirements.

Why Use Advanced Filtering:

BenefitDescriptionExample Use Case
PrecisionMultiple criteria for exact matchesFind customers in specific region with recent orders
FlexibilityAlternative conditions with ORProducts from multiple vendors or categories
EfficiencySingle query vs multiple queriesBetter performance than separate lookups
Business LogicComplex rules in databasePricing tiers, eligibility criteria, status combinations

Logical Operators Overview:

| Operator | Purpose | Syntax | Example |-----|-----|----- | AND | All conditions must be true | condition1 AND condition2 | price > 10 AND category = 'Electronics' | OR | Any condition can be true | condition1 OR condition2 | state = 'CA' OR state = 'NY' | IN | Match any value in list | column IN (value1, value2) | status IN ('Active', 'Pending') | NOT | Negate a condition | NOT condition | NOT category = 'Discontinued'

Combining WHERE Clauses with AND

The AND operator requires all conditions to be true for a row to be included in the results. This creates more restrictive filters.

Basic AND Syntax:

SELECT column_list
FROM table_name
WHERE condition1 AND condition2 AND condition3;

Simple AND Example:

-- Products from specific vendor under $4
SELECT prod_id, prod_price, prod_name
FROM Products
WHERE vend_id = 'DLL01' AND prod_price <= 4;

Output:

prod_id    prod_price    prod_name
-------    ----------    --------------------
BNBG02     3.4900        Bird bean bag toy
BNBG01     3.4900        Fish bean bag toy
BNBG03     3.4900        Rabbit bean bag toy

Multiple AND Conditions:

-- Customers in specific state with recent orders
SELECT cust_name, cust_city, cust_state
FROM Customers
WHERE cust_state = 'CA' 
  AND cust_country = 'USA'
  AND last_order_date >= '2024-01-01';
 
-- High-value orders from premium customers
SELECT order_num, order_date, order_total, cust_id
FROM Orders
WHERE order_total > 1000
  AND cust_status = 'Premium'
  AND order_date >= DATEADD(month, -3, GETDATE());
 
-- In-stock products in specific price range
SELECT prod_name, prod_price, quantity_in_stock
FROM Products
WHERE prod_price BETWEEN 10 AND 50
  AND quantity_in_stock > 0
  AND prod_category = 'Electronics';

Date and Time AND Filtering:

-- Orders from specific time period and customer type
SELECT order_num, order_date, cust_id, order_total
FROM Orders
WHERE order_date >= '2024-01-01'
  AND order_date < '2024-04-01'
  AND cust_type = 'Business'
  AND order_total > 500;
 
-- Recent active customers
SELECT cust_name, cust_email, registration_date
FROM Customers
WHERE registration_date >= DATEADD(year, -1, GETDATE())
  AND cust_status = 'Active'
  AND email_verified = 1;

Combining WHERE Clauses with OR

The OR operator includes rows where any condition is true. This creates more inclusive filters and provides alternative matching criteria.

Basic OR Syntax:

SELECT column_list
FROM table_name
WHERE condition1 OR condition2 OR condition3;

Simple OR Example:

-- Products from either vendor
SELECT prod_id, prod_price, prod_name
FROM Products
WHERE vend_id = 'DLL01' OR vend_id = 'BRS01';

Output:

prod_name              prod_price
-------------------    ----------
Fish bean bag toy      3.4900
Bird bean bag toy      3.4900
Rabbit bean bag toy    3.4900
8 inch teddy bear      5.9900
12 inch teddy bear     8.9900
18 inch teddy bear     11.9900
Raggedy Ann            4.9900

Multiple OR Conditions:

-- Customers from multiple states
SELECT cust_name, cust_city, cust_state
FROM Customers
WHERE cust_state = 'CA' 
   OR cust_state = 'NY' 
   OR cust_state = 'TX';
 
-- Orders with various status conditions
SELECT order_num, order_status, order_date
FROM Orders
WHERE order_status = 'Pending'
   OR order_status = 'Processing'
   OR order_status = 'Shipped';
 
-- Products in multiple categories
SELECT prod_name, prod_category, prod_price
FROM Products
WHERE prod_category = 'Electronics'
   OR prod_category = 'Computers'
   OR prod_category = 'Software';

OR with Different Columns:

-- Flexible customer search
SELECT cust_name, cust_email, cust_phone
FROM Customers
WHERE cust_city = 'Los Angeles'
   OR cust_email LIKE '%@gmail.com'
   OR order_total > 5000;
 
-- Product availability options
SELECT prod_name, prod_price, quantity_in_stock
FROM Products
WHERE quantity_in_stock > 100
   OR prod_price < 10
   OR prod_category = 'Clearance';

Understanding Order of Evaluation

SQL processes AND operators before OR operators (operator precedence), which can lead to unexpected results if not properly managed with parentheses.

Problematic Example Without Parentheses:

-- ❌ Problem: Incorrect grouping due to operator precedence
SELECT prod_name, prod_price
FROM Products
WHERE vend_id = 'DLL01' OR vend_id = 'BRS01'
      AND prod_price >= 10;

What SQL Actually Processes:

-- SQL interprets this as:
WHERE vend_id = 'DLL01' OR (vend_id = 'BRS01' AND prod_price >= 10)
-- This returns:
-- 1. ALL products from DLL01 (regardless of price)
-- 2. Products from BRS01 that cost $10 or more

Output (Incorrect):

prod_name              prod_price
-------------------    ----------
Fish bean bag toy      3.4900      -- DLL01, under $10 (unexpected)
Bird bean bag toy      3.4900      -- DLL01, under $10 (unexpected)
Rabbit bean bag toy    3.4900      -- DLL01, under $10 (unexpected)
18 inch teddy bear     11.9900     -- BRS01, over $10 (expected)
Raggedy Ann            4.9900      -- DLL01, under $10 (unexpected)

Correct Solution with Parentheses:

-- ✅ Correct: Explicit grouping with parentheses
SELECT prod_name, prod_price
FROM Products
WHERE (vend_id = 'DLL01' OR vend_id = 'BRS01')
      AND prod_price >= 10;

What SQL Processes:

-- SQL interprets this as:
WHERE (vend_id = 'DLL01' OR vend_id = 'BRS01') AND prod_price >= 10
-- This returns:
-- Products from EITHER vendor that cost $10 or more

Output (Correct):

prod_name              prod_price
-------------------    ----------
18 inch teddy bear     11.9900

Complex Grouping Examples:

-- Multiple condition groups
SELECT cust_name, cust_state, order_total
FROM Customers c
JOIN Orders o ON c.cust_id = o.cust_id
WHERE (cust_state = 'CA' OR cust_state = 'NY')
  AND (order_total > 1000 OR cust_type = 'Premium')
  AND order_date >= '2024-01-01';
 
-- Business logic with proper grouping
SELECT prod_name, prod_price, prod_category
FROM Products
WHERE (prod_category = 'Electronics' OR prod_category = 'Computers')
  AND (prod_price < 100 OR quantity_in_stock > 50)
  AND prod_status = 'Active';
 
-- Date range with multiple criteria
SELECT order_num, order_date, cust_id, order_total
FROM Orders
WHERE (order_date >= '2024-01-01' AND order_date < '2024-04-01')
  AND (order_total > 500 OR cust_type = 'VIP')
  AND (order_status = 'Completed' OR order_status = 'Shipped');

Best Practices for Operator Precedence:

-- ✅ Always use parentheses for clarity
SELECT *
FROM Products
WHERE (category = 'A' OR category = 'B')
  AND (price > 10 AND price < 100)
  AND status = 'Active';
 
-- ✅ Group related conditions
SELECT *
FROM Customers
WHERE (state = 'CA' OR state = 'NY' OR state = 'TX')
  AND (customer_type = 'Premium' OR total_orders > 10)
  AND registration_date >= '2023-01-01';
 
-- ❌ Avoid relying on default precedence
SELECT *
FROM Products
WHERE category = 'A' OR category = 'B' AND price > 10;  -- Unclear intent

Using the IN Operator

The IN operator provides a clean way to test against multiple values, functioning as a more readable alternative to multiple OR conditions.

Basic IN Syntax:

SELECT column_list
FROM table_name
WHERE column_name IN (value1, value2, value3, ...);

Simple IN Example:

-- Products from specific vendors
SELECT prod_name, prod_price
FROM Products
WHERE vend_id IN ('DLL01', 'BRS01')
ORDER BY prod_name;

Output:

prod_name              prod_price
-------------------    ----------
12 inch teddy bear     8.9900
18 inch teddy bear     11.9900
8 inch teddy bear      5.9900
Bird bean bag toy      3.4900
Fish bean bag toy      3.4900
Rabbit bean bag toy    3.4900
Raggedy Ann            4.9900

IN vs OR Comparison:

-- Using IN (cleaner, more readable)
SELECT prod_name, prod_price
FROM Products
WHERE vend_id IN ('DLL01', 'BRS01', 'FNG01', 'ABC01')
ORDER BY prod_name;
 
-- Equivalent using OR (more verbose)
SELECT prod_name, prod_price
FROM Products
WHERE vend_id = 'DLL01' 
   OR vend_id = 'BRS01' 
   OR vend_id = 'FNG01' 
   OR vend_id = 'ABC01'
ORDER BY prod_name;

Advanced IN Examples:

-- Multiple status values
SELECT order_num, order_status, order_date
FROM Orders
WHERE order_status IN ('Pending', 'Processing', 'Shipped', 'Delivered');
 
-- Specific customer types
SELECT cust_name, cust_type, cust_city
FROM Customers
WHERE cust_type IN ('Premium', 'VIP', 'Corporate')
  AND cust_state IN ('CA', 'NY', 'TX');
 
-- Product categories
SELECT prod_name, prod_category, prod_price
FROM Products
WHERE prod_category IN ('Electronics', 'Computers', 'Software', 'Hardware')
  AND prod_price BETWEEN 50 AND 500;
 
-- Numeric values
SELECT emp_name, emp_department, emp_salary
FROM Employees
WHERE emp_department_id IN (10, 20, 30, 40)
  AND emp_salary > 50000;

IN with Subqueries (Preview):

-- Products from vendors in specific states
SELECT prod_name, prod_price
FROM Products
WHERE vend_id IN (
    SELECT vend_id 
    FROM Vendors 
    WHERE vend_state IN ('CA', 'NY')
);
 
-- Orders from customers with high total purchases
SELECT order_num, order_date, order_total
FROM Orders
WHERE cust_id IN (
    SELECT cust_id 
    FROM Customers 
    WHERE total_purchases > 10000
);

Advantages of IN Operator:

AdvantageDescriptionExample
ReadabilityCleaner syntax than multiple ORsstatus IN ('A', 'B', 'C') vs status = 'A' OR status = 'B' OR status = 'C'
PerformanceOften faster than OR chainsBetter optimization by database engine
MaintainabilityEasier to modify value listsAdd/remove values in single location
Subquery SupportCan use with SELECT statementsDynamic value lists from other tables
Evaluation OrderClearer precedence with other operatorsLess ambiguity in complex WHERE clauses

Using the NOT Operator

The NOT operator negates conditions, allowing you to find rows that don't match specified criteria. It's particularly powerful when combined with other operators.

Basic NOT Syntax:

SELECT column_list
FROM table_name
WHERE NOT condition;

Simple NOT Example:

-- Products NOT from specific vendor
SELECT prod_name
FROM Products
WHERE NOT vend_id = 'DLL01'
ORDER BY prod_name;

Output:

prod_name
------------------
12 inch teddy bear
18 inch teddy bear
8 inch teddy bear
King doll
Queen doll

NOT with Different Operators:

-- NOT with equality (same as <> or !=)
SELECT prod_name
FROM Products
WHERE NOT vend_id = 'DLL01';
 
-- Equivalent using <> operator
SELECT prod_name
FROM Products
WHERE vend_id <> 'DLL01';
 
-- NOT with IN operator
SELECT cust_name, cust_state
FROM Customers
WHERE cust_state NOT IN ('CA', 'NY', 'TX');
 
-- NOT with BETWEEN
SELECT prod_name, prod_price
FROM Products
WHERE prod_price NOT BETWEEN 10 AND 50;
 
-- NOT with LIKE (pattern matching)
SELECT cust_name, cust_email
FROM Customers
WHERE cust_email NOT LIKE '%@gmail.com';

Advanced NOT Examples:

-- Customers without recent orders
SELECT c.cust_name, c.cust_email
FROM Customers c
WHERE c.cust_id NOT IN (
    SELECT DISTINCT cust_id 
    FROM Orders 
    WHERE order_date >= DATEADD(month, -6, GETDATE())
);
 
-- Products not in specific categories
SELECT prod_name, prod_category, prod_price
FROM Products
WHERE prod_category NOT IN ('Discontinued', 'Clearance', 'Seasonal')
  AND quantity_in_stock > 0;
 
-- Orders not from specific regions
SELECT order_num, order_date, cust_id
FROM Orders o
JOIN Customers c ON o.cust_id = c.cust_id
WHERE c.cust_state NOT IN ('CA', 'NY', 'FL')
  AND o.order_total > 100;

NOT with Complex Conditions:

-- NOT with grouped conditions
SELECT prod_name, prod_price, vend_id
FROM Products
WHERE NOT (vend_id = 'DLL01' AND prod_price > 10);
 
-- This is equivalent to:
SELECT prod_name, prod_price, vend_id
FROM Products
WHERE vend_id != 'DLL01' OR prod_price <= 10;
 
-- NOT with multiple conditions
SELECT cust_name, cust_state, cust_type
FROM Customers
WHERE NOT (cust_state = 'CA' AND cust_type = 'Individual')
  AND cust_status = 'Active';

NOT vs Alternative Operators:

-- NOT with equality
SELECT * FROM Products WHERE NOT prod_category = 'Electronics';
-- Same as:
SELECT * FROM Products WHERE prod_category <> 'Electronics';
 
-- NOT with IN
SELECT * FROM Customers WHERE cust_state NOT IN ('CA', 'NY');
-- Same as:
SELECT * FROM Customers WHERE cust_state <> 'CA' AND cust_state <> 'NY';
 
-- NOT with BETWEEN
SELECT * FROM Products WHERE prod_price NOT BETWEEN 10 AND 50;
-- Same as:
SELECT * FROM Products WHERE prod_price < 10 OR prod_price > 50;

Complex Filtering Combinations

Real-world scenarios often require combining multiple logical operators to create sophisticated filtering logic that matches business requirements.

Business Logic Examples:

-- E-commerce product search
SELECT prod_name, prod_price, prod_category, vend_name
FROM Products p
JOIN Vendors v ON p.vend_id = v.vend_id
WHERE (prod_category IN ('Electronics', 'Computers', 'Software'))
  AND (prod_price BETWEEN 50 AND 500)
  AND (quantity_in_stock > 0)
  AND NOT (prod_status = 'Discontinued')
ORDER BY prod_price;
 
-- Customer segmentation
SELECT cust_name, cust_type, total_orders, total_spent
FROM Customers
WHERE (cust_type = 'Premium' OR total_spent > 5000)
  AND (total_orders >= 10 OR registration_date >= '2023-01-01')
  AND cust_status = 'Active'
  AND cust_email IS NOT NULL;
 
-- Order fulfillment analysis
SELECT o.order_num, o.order_date, o.order_total, c.cust_name
FROM Orders o
JOIN Customers c ON o.cust_id = c.cust_id
WHERE (o.order_status IN ('Processing', 'Shipped'))
  AND (o.order_total > 100 OR c.cust_type = 'VIP')
  AND o.order_date >= DATEADD(day, -30, GETDATE())
  AND NOT (o.shipping_method = 'Standard' AND o.order_total < 50);

Data Quality and Validation:

-- Find incomplete customer records
SELECT cust_id, cust_name, cust_email, cust_phone
FROM Customers
WHERE (cust_email IS NULL OR cust_email = '')
   OR (cust_phone IS NULL OR cust_phone = '')
   OR (cust_address IS NULL OR cust_address = '')
  AND cust_status = 'Active';
 
-- Identify problematic orders
SELECT order_num, order_date, order_total, order_status
FROM Orders
WHERE (order_total <= 0 OR order_total IS NULL)
   OR (order_date > GETDATE())
   OR (order_status NOT IN ('Pending', 'Processing', 'Shipped', 'Delivered', 'Cancelled'))
   OR (cust_id IS NULL);
 
-- Product inventory issues
SELECT prod_id, prod_name, quantity_in_stock, prod_price
FROM Products
WHERE (quantity_in_stock < 0)
   OR (prod_price <= 0 OR prod_price IS NULL)
   OR (prod_name IS NULL OR prod_name = '')
  AND prod_status = 'Active';

Performance Optimization Patterns:

-- ✅ Good: Most selective conditions first
SELECT cust_name, order_total
FROM Customers c
JOIN Orders o ON c.cust_id = o.cust_id
WHERE o.order_date >= '2024-01-01'  -- Date range (selective)
  AND c.cust_type = 'Premium'       -- Customer type (selective)
  AND (o.order_total > 1000 OR c.total_lifetime_value > 10000);
 
-- ✅ Good: Use indexes effectively
SELECT prod_name, prod_price
FROM Products
WHERE prod_category = 'Electronics'  -- Indexed column
  AND prod_price BETWEEN 100 AND 500 -- Range on indexed column
  AND quantity_in_stock > 0;
 
-- ❌ Avoid: Functions in WHERE clause
SELECT cust_name
FROM Customers
WHERE UPPER(cust_name) LIKE 'JOHN%';  -- Can't use index
 
-- ✅ Better: Store data consistently
SELECT cust_name
FROM Customers
WHERE cust_name LIKE 'John%';  -- Can use index

Challenge Solutions

Challenge 1: California vendors

-- Find vendors in California, USA
SELECT vend_name
FROM Vendors
WHERE vend_country = 'USA' AND vend_state = 'CA';

Challenge 2: Large orders for specific products

-- Orders with 100+ items of BR01, BR02, or BR03
SELECT order_num, prod_id, quantity
FROM OrderItems
WHERE prod_id IN ('BR01', 'BR02', 'BR03')
  AND quantity >= 100;
 
-- Alternative with proper grouping
SELECT order_num, prod_id, quantity
FROM OrderItems
WHERE (prod_id = 'BR01' OR prod_id = 'BR02' OR prod_id = 'BR03')
  AND quantity >= 100;

Challenge 3: Price range with AND

-- Products between $3 and $6 using AND
SELECT prod_name, prod_price
FROM Products
WHERE prod_price >= 3 AND prod_price <= 6
ORDER BY prod_price;

Challenge 4: SQL statement error

-- ❌ Wrong: ORDER BY before WHERE
SELECT vend_name
FROM Vendors
ORDER BY vend_name
WHERE vend_country = 'USA' AND vend_state = 'CA';
 
-- ✅ Correct: WHERE before ORDER BY
SELECT vend_name
FROM Vendors
WHERE vend_country = 'USA' AND vend_state = 'CA'
ORDER BY vend_name;

Error Explanation: The WHERE clause must come before the ORDER BY clause. SQL processes clauses in a specific order: FROM → WHERE → ORDER BY → SELECT.

Common Advanced Filtering Mistakes

Operator Precedence Errors:

-- ❌ Wrong: Unclear precedence
SELECT * FROM Products
WHERE category = 'A' OR category = 'B' AND price > 10;
 
-- ✅ Correct: Clear grouping
SELECT * FROM Products
WHERE (category = 'A' OR category = 'B') AND price > 10;

NULL Handling Issues:

-- ❌ Wrong: NOT doesn't handle NULLs as expected
SELECT cust_name FROM Customers
WHERE NOT cust_email = 'test@example.com';  -- Excludes NULLs
 
-- ✅ Correct: Explicitly handle NULLs
SELECT cust_name FROM Customers
WHERE cust_email != 'test@example.com' OR cust_email IS NULL;

IN vs OR Performance:

-- ✅ Good: Use IN for multiple values
SELECT * FROM Products
WHERE category IN ('A', 'B', 'C', 'D', 'E');
 
-- ❌ Less efficient: Long OR chains
SELECT * FROM Products
WHERE category = 'A' OR category = 'B' OR category = 'C' OR category = 'D' OR category = 'E';

Best Practices Summary

1. Always Use Parentheses:

-- ✅ Clear and unambiguous
SELECT * FROM Products
WHERE (category = 'Electronics' OR category = 'Computers')
  AND (price > 100 AND price < 1000)
  AND status = 'Active';

2. Choose Appropriate Operators:

-- ✅ Use IN for multiple values
WHERE status IN ('Active', 'Pending', 'Processing');
 
-- ✅ Use BETWEEN for ranges
WHERE price BETWEEN 10 AND 100;
 
-- ✅ Use NOT for negation
WHERE category NOT IN ('Discontinued', 'Clearance');

3. Consider Performance:

-- ✅ Most selective conditions first
WHERE date_column >= '2024-01-01'  -- Selective
  AND category = 'Electronics'     -- Selective
  AND (price > 100 OR status = 'Premium');  -- Less selective

4. Handle NULLs Explicitly:

-- ✅ Account for NULL values
WHERE (email != 'spam@example.com' OR email IS NULL)
  AND status = 'Active';

Wildcard Filtering

Understanding Wildcard Filtering

Wildcard filtering enables pattern-based searching when you need to find data that matches partial or flexible criteria, rather than exact values. This is essential for text searches, data exploration, and flexible filtering scenarios.

When to Use Wildcards:

ScenarioExampleWildcard Solution
Partial Text SearchFind products containing "bean bag"LIKE '%bean bag%'
Prefix MatchingNames starting with "John"LIKE 'John%'
Suffix MatchingFiles ending with ".pdf"LIKE '%.pdf'
Pattern MatchingPhone numbers like "555-xxxx"LIKE '555-____'
Character SetsNames starting with J or MLIKE '[JM]%'

Wildcard vs Standard Operators:

-- ✅ Standard operators: Known exact values
SELECT * FROM Products WHERE prod_price = 9.99;
SELECT * FROM Customers WHERE cust_state = 'CA';
 
-- ✅ Wildcards: Pattern-based matching
SELECT * FROM Products WHERE prod_name LIKE '%toy%';
SELECT * FROM Customers WHERE cust_email LIKE '%@gmail.com';

Key Concepts:

  • Wildcards: Special characters with pattern-matching meanings
  • Search Patterns: Combinations of literal text and wildcards
  • LIKE Operator: Required for wildcard pattern matching
  • Text Fields Only: Wildcards work only with string/text data types

The LIKE Operator

LIKE is the foundation of wildcard filtering, instructing the database to perform pattern matching rather than exact equality comparisons.

Basic LIKE Syntax:

SELECT column_list
FROM table_name
WHERE column_name LIKE 'pattern';

LIKE vs Equality Comparison:

-- Exact match (equality)
SELECT prod_name FROM Products
WHERE prod_name = 'Fish bean bag toy';  -- Only exact matches
 
-- Pattern match (LIKE)
SELECT prod_name FROM Products
WHERE prod_name LIKE 'Fish%';  -- Starts with "Fish"

Important LIKE Characteristics:

CharacteristicDescriptionExample
Case SensitivityDepends on database configuration'fish%' may not match 'Fish bean bag toy'
Text Fields OnlyCannot use with numeric/date fieldsWHERE price LIKE '10%' is invalid
Pattern RequiredMust include wildcard charactersLIKE 'exact' same as = 'exact'
NULL HandlingWildcards don't match NULL valuesLIKE '%' excludes NULL rows

The Percent Sign (%) Wildcard

The % wildcard matches zero, one, or more characters at the specified position, making it the most versatile and commonly used wildcard.

Basic % Usage:

-- Products starting with "Fish"
SELECT prod_id, prod_name
FROM Products
WHERE prod_name LIKE 'Fish%';

Output:

prod_id    prod_name
-------    ------------------
BNBG01     Fish bean bag toy

% Wildcard Positions:

-- Starts with pattern
SELECT prod_name FROM Products
WHERE prod_name LIKE 'Fish%';
-- Matches: "Fish bean bag toy", "Fishing rod", "Fish"
 
-- Ends with pattern
SELECT prod_name FROM Products
WHERE prod_name LIKE '%toy';
-- Matches: "Fish bean bag toy", "Teddy bear toy", "toy"
 
-- Contains pattern anywhere
SELECT prod_name FROM Products
WHERE prod_name LIKE '%bean bag%';
-- Matches: "Fish bean bag toy", "Large bean bag chair"
 
-- Multiple patterns
SELECT prod_name FROM Products
WHERE prod_name LIKE 'F%y';
-- Matches: "Fish bean bag toy", "Funny", "Fly"

Advanced % Examples:

-- Email domain filtering
SELECT cust_name, cust_email
FROM Customers
WHERE cust_email LIKE '%@gmail.com';
 
-- Product code patterns
SELECT prod_id, prod_name
FROM Products
WHERE prod_id LIKE 'BR%';  -- All products with codes starting "BR"
 
-- File extension matching
SELECT file_name, file_size
FROM Documents
WHERE file_name LIKE '%.pdf';
 
-- Multiple word search
SELECT prod_name, prod_desc
FROM Products
WHERE prod_desc LIKE '%wireless%'
  AND prod_desc LIKE '%bluetooth%';

% Wildcard with Multiple Patterns:

-- Products containing "bean bag" anywhere
SELECT prod_id, prod_name
FROM Products
WHERE prod_name LIKE '%bean bag%';

Output:

prod_id     prod_name
--------    --------------------
BNBG01      Fish bean bag toy
BNBG02      Bird bean bag toy
BNBG03      Rabbit bean bag toy

Complex % Patterns:

-- Names with middle initial pattern
SELECT cust_name
FROM Customers
WHERE cust_name LIKE '% _. %';  -- "John A. Smith"
 
-- Product codes with specific format
SELECT prod_id, prod_name
FROM Products
WHERE prod_id LIKE '__-____%';  -- "AB-1234X", "CD-5678Y"
 
-- Partial email search
SELECT cust_name, cust_email
FROM Customers
WHERE cust_email LIKE 'b%@forta.com';  -- Emails starting with 'b' at forta.com

% Wildcard Characteristics:

FeatureDescriptionExample
Zero CharactersMatches empty string'Fish%' matches 'Fish'
One CharacterMatches single character'Fish%' matches 'Fishy'
Multiple CharactersMatches any length'Fish%' matches 'Fish bean bag toy'
Position FlexibleCan be anywhere in pattern'%fish%', 'fish%', '%fish'

The Underscore (_) Wildcard

The _ wildcard matches exactly one character - no more, no less. This provides precise control over pattern length and character positions.

Basic _ Usage:

-- Products with exactly 2 characters before "inch teddy bear"
SELECT prod_id, prod_name
FROM Products
WHERE prod_name LIKE '__ inch teddy bear';

Output:

prod_id     prod_name
--------    --------------------
BR02        12 inch teddy bear
BR03        18 inch teddy bear

Note: "8 inch teddy bear" doesn't match because it has only 1 character before "inch", but the pattern requires exactly 2.

_ vs % Comparison:

-- Underscore: exactly one character
SELECT prod_id, prod_name
FROM Products
WHERE prod_name LIKE '__ inch teddy bear';
-- Matches: "12 inch teddy bear", "18 inch teddy bear"
-- Doesn't match: "8 inch teddy bear" (only 1 character)
 
-- Percent: any number of characters
SELECT prod_id, prod_name
FROM Products
WHERE prod_name LIKE '% inch teddy bear';
-- Matches: "8 inch teddy bear", "12 inch teddy bear", "18 inch teddy bear"

Output for % version:

prod_id     prod_name
--------    --------------------
BR01        8 inch teddy bear
BR02        12 inch teddy bear
BR03        18 inch teddy bear

Practical _ Examples:

-- Phone number patterns
SELECT cust_name, cust_phone
FROM Customers
WHERE cust_phone LIKE '555-____';  -- 555- followed by exactly 4 digits
 
-- Product codes with fixed format
SELECT prod_id, prod_name
FROM Products
WHERE prod_id LIKE 'BR__';  -- BR followed by exactly 2 characters
 
-- Date patterns (if stored as text)
SELECT order_num, order_date
FROM Orders
WHERE order_date LIKE '2024-__-01';  -- First day of any month in 2024
 
-- License plate patterns
SELECT vehicle_id, license_plate
FROM Vehicles
WHERE license_plate LIKE '___-____';  -- 3 chars, dash, 4 chars

Multiple _ Wildcards:

-- Exact character count matching
SELECT prod_name
FROM Products
WHERE prod_name LIKE '___ inch ____';  -- 3 chars + " inch " + 4 chars
 
-- Fixed-length codes
SELECT cust_id, cust_name
FROM Customers
WHERE cust_id LIKE 'C____';  -- C followed by exactly 4 characters
 
-- Pattern with mixed wildcards
SELECT prod_name
FROM Products
WHERE prod_name LIKE '_% inch %';  -- 1 char + any chars + " inch " + any chars

_ Wildcard Characteristics:

FeatureDescriptionExample
Exact CountAlways matches exactly one character'_' never matches 0 or 2+ characters
Position SpecificEach _ represents one position'__' = exactly 2 characters
Cannot SkipMust have character at each _ position'A_C' won't match 'AC'
Combine with %Mix for flexible patterns'__%' = at least 2 characters

The Brackets ([]) Wildcard

The [] wildcard specifies a character set where any single character from the set can match at that position. This provides precise control over allowed characters.

Database Support:

DatabaseSupportNotes
SQL Server✅ Full supportIncludes negation with ^
MySQL❌ Not supportedUse REGEXP instead
Oracle❌ Not supportedUse REGEXP_LIKE instead
PostgreSQL❌ Not supportedUse SIMILAR TO or REGEXP
SQLite❌ Not supportedLimited pattern matching

Basic [] Usage (SQL Server):

-- Names starting with J or M
SELECT cust_contact
FROM Customers
WHERE cust_contact LIKE '[JM]%'
ORDER BY cust_contact;

Output:

cust_contact
-----------------
Jim Jones
John Smith
Michelle Green

Character Set Examples:

-- Specific letters
SELECT cust_contact
FROM Customers
WHERE cust_contact LIKE '[ABCD]%';  -- Starts with A, B, C, or D
 
-- Numeric ranges
SELECT prod_id
FROM Products
WHERE prod_id LIKE 'BR[0-9][0-9]';  -- BR followed by 2 digits
 
-- Mixed character sets
SELECT cust_name
FROM Customers
WHERE cust_name LIKE '[A-Z][a-z]%';  -- Uppercase + lowercase + any chars
 
-- Multiple character sets
SELECT prod_code
FROM Products
WHERE prod_code LIKE '[A-Z][A-Z][0-9][0-9]';  -- 2 letters + 2 digits

Negated Character Sets with ^:

-- Names NOT starting with J or M
SELECT cust_contact
FROM Customers
WHERE cust_contact LIKE '[^JM]%'
ORDER BY cust_contact;
 
-- Product codes NOT starting with specific letters
SELECT prod_id, prod_name
FROM Products
WHERE prod_id LIKE '[^AB]%';  -- Doesn't start with A or B
 
-- Non-numeric first character
SELECT cust_id
FROM Customers
WHERE cust_id LIKE '[^0-9]%';  -- Doesn't start with digit

Alternative to [] with NOT:

-- Using NOT operator instead of ^
SELECT cust_contact
FROM Customers
WHERE NOT cust_contact LIKE '[JM]%'
ORDER BY cust_contact;
 
-- Multiple NOT conditions
SELECT cust_contact
FROM Customers
WHERE cust_contact NOT LIKE 'J%'
  AND cust_contact NOT LIKE 'M%'
ORDER BY cust_contact;

Cross-Database Alternatives:

-- MySQL: Using REGEXP
SELECT cust_contact
FROM Customers
WHERE cust_contact REGEXP '^[JM]';
 
-- PostgreSQL: Using SIMILAR TO
SELECT cust_contact
FROM Customers
WHERE cust_contact SIMILAR TO '[JM]%';
 
-- Oracle: Using REGEXP_LIKE
SELECT cust_contact
FROM Customers
WHERE REGEXP_LIKE(cust_contact, '^[JM]');

Advanced Wildcard Combinations

Real-world scenarios often require combining multiple wildcards to create sophisticated pattern matching for complex business requirements.

Complex Pattern Examples:

-- Email validation patterns
SELECT cust_name, cust_email
FROM Customers
WHERE cust_email LIKE '%_@_%._%';  -- Basic email format
 
-- Product code validation
SELECT prod_id, prod_name
FROM Products
WHERE prod_id LIKE '[A-Z][A-Z][0-9][0-9][0-9]';  -- 2 letters + 3 digits
 
-- Phone number patterns
SELECT cust_name, cust_phone
FROM Customers
WHERE cust_phone LIKE '([0-9][0-9][0-9]) [0-9][0-9][0-9]-[0-9][0-9][0-9][0-9]';
 
-- File naming conventions
SELECT file_name, file_type
FROM Documents
WHERE file_name LIKE 'DOC_____-____.pdf';  -- DOC + 5 chars + dash + 4 chars + .pdf

Business Logic Patterns:

-- Customer ID format validation
SELECT cust_id, cust_name
FROM Customers
WHERE cust_id LIKE 'CUST[0-9][0-9][0-9][0-9]'  -- CUST + 4 digits
   OR cust_id LIKE 'VIP[0-9][0-9][0-9]';        -- VIP + 3 digits
 
-- Product search with multiple criteria
SELECT prod_name, prod_desc
FROM Products
WHERE (prod_name LIKE '%wireless%' OR prod_desc LIKE '%wireless%')
  AND (prod_name LIKE '%mouse%' OR prod_desc LIKE '%mouse%');
 
-- Address format checking
SELECT cust_name, cust_address
FROM Customers
WHERE cust_address LIKE '[0-9]% % St'     -- Number + street name + St
   OR cust_address LIKE '[0-9]% % Ave'    -- Number + street name + Ave
   OR cust_address LIKE '[0-9]% % Blvd';  -- Number + street name + Blvd

Data Quality and Validation:

-- Find inconsistent data formats
SELECT cust_id, cust_phone
FROM Customers
WHERE cust_phone NOT LIKE '([0-9][0-9][0-9]) [0-9][0-9][0-9]-[0-9][0-9][0-9][0-9]'
  AND cust_phone IS NOT NULL;
 
-- Identify potential data entry errors
SELECT prod_id, prod_name
FROM Products
WHERE prod_name LIKE '%  %'  -- Double spaces
   OR prod_name LIKE ' %'    -- Leading space
   OR prod_name LIKE '% ';   -- Trailing space
 
-- Check for incomplete records
SELECT cust_id, cust_name, cust_email
FROM Customers
WHERE cust_email NOT LIKE '%@%.%'  -- Invalid email format
   OR cust_email LIKE '%@%@%';     -- Multiple @ symbols

Performance Considerations

Wildcard searches can be expensive, especially when wildcards appear at the beginning of patterns. Understanding performance implications helps optimize queries.

Performance Impact by Pattern Type:

| Pattern Type | Performance | Index Usage | Example |-----|-----|----- | Prefix (suffix %) | ✅ Fast | Can use index | 'John%' | Suffix (prefix %) | ❌ Slow | Cannot use index | '%Smith' | Contains (% both sides) | ❌ Slowest | Cannot use index | '%John%' | Fixed position (_) | ⚠️ Moderate | Limited index use | 'J_hn'

Optimization Strategies:

-- ✅ Good: Prefix matching (can use index)
SELECT cust_name
FROM Customers
WHERE cust_name LIKE 'Smith%';
 
-- ❌ Slow: Leading wildcard (full table scan)
SELECT cust_name
FROM Customers
WHERE cust_name LIKE '%Smith';
 
-- ✅ Better: Use additional criteria to limit scope
SELECT cust_name
FROM Customers
WHERE cust_state = 'CA'  -- Indexed column first
  AND cust_name LIKE '%Smith%';
 
-- ✅ Good: Combine with selective filters
SELECT prod_name, prod_price
FROM Products
WHERE prod_category = 'Electronics'  -- Selective filter
  AND prod_name LIKE '%wireless%';   -- Then wildcard search

Alternative Approaches for Better Performance:

-- Full-text search (if available)
SELECT prod_name, prod_desc
FROM Products
WHERE CONTAINS(prod_desc, 'wireless bluetooth');
 
-- Separate search columns
SELECT cust_name
FROM Customers
WHERE cust_name_search LIKE '%SMITH%';  -- Pre-processed uppercase column
 
-- Use LIMIT to control result size
SELECT TOP 100 prod_name
FROM Products
WHERE prod_name LIKE '%toy%'
ORDER BY prod_name;

Common Wildcard Pitfalls

Trailing Spaces Issue:

-- ❌ Problem: Trailing spaces prevent matches
SELECT prod_name
FROM Products
WHERE prod_name LIKE 'F%y';  -- Won't match "Fishy    " (with trailing spaces)
 
-- ✅ Solution: Add trailing wildcard
SELECT prod_name
FROM Products
WHERE prod_name LIKE 'F%y%';  -- Matches with trailing spaces
 
-- ✅ Better: Use TRIM function
SELECT prod_name
FROM Products
WHERE TRIM(prod_name) LIKE 'F%y';

Case Sensitivity Issues:

-- ❌ May not work: Case sensitive search
SELECT cust_name
FROM Customers
WHERE cust_name LIKE 'john%';  -- Won't match "John Smith"
 
-- ✅ Solution: Use UPPER/LOWER functions
SELECT cust_name
FROM Customers
WHERE UPPER(cust_name) LIKE 'JOHN%';
 
-- ✅ Alternative: Case-insensitive collation (SQL Server)
SELECT cust_name
FROM Customers
WHERE cust_name COLLATE SQL_Latin1_General_CP1_CI_AS LIKE 'john%';

NULL Value Handling:

-- ❌ Problem: Wildcards don't match NULL
SELECT cust_name
FROM Customers
WHERE cust_email LIKE '%';  -- Excludes rows where cust_email IS NULL
 
-- ✅ Solution: Explicitly handle NULLs
SELECT cust_name
FROM Customers
WHERE cust_email LIKE '%' OR cust_email IS NULL;
 
-- ✅ Alternative: Use COALESCE
SELECT cust_name
FROM Customers
WHERE COALESCE(cust_email, '') LIKE '%';

Challenge Solutions

Challenge 1: Products with "toy" in description

-- Find products containing "toy" in description
SELECT prod_name, prod_desc
FROM Products
WHERE prod_desc LIKE '%toy%';

Challenge 2: Products WITHOUT "toy" in description

-- Find products not containing "toy" in description
SELECT prod_name, prod_desc
FROM Products
WHERE prod_desc NOT LIKE '%toy%'
ORDER BY prod_name;
 
-- Alternative handling NULLs
SELECT prod_name, prod_desc
FROM Products
WHERE (prod_desc NOT LIKE '%toy%' OR prod_desc IS NULL)
ORDER BY prod_name;

Challenge 3: Products with both "toy" AND "carrots"

-- Find products containing both "toy" and "carrots"
SELECT prod_name, prod_desc
FROM Products
WHERE prod_desc LIKE '%toy%'
  AND prod_desc LIKE '%carrots%';

Challenge 4: "toy" before "carrots" in description

-- Find products with "toy" appearing before "carrots"
SELECT prod_name, prod_desc
FROM Products
WHERE prod_desc LIKE '%toy%carrots%';

Best Practices Summary

1. Choose the Right Wildcard:

-- ✅ Use % for variable-length matching
WHERE prod_name LIKE 'Fish%';
 
-- ✅ Use _ for exact character count
WHERE prod_id LIKE 'BR__';
 
-- ✅ Use [] for character sets (where supported)
WHERE cust_name LIKE '[A-M]%';

2. Optimize for Performance:

-- ✅ Avoid leading wildcards when possible
WHERE cust_name LIKE 'Smith%';  -- Good
 
-- ✅ Combine with selective filters
WHERE cust_state = 'CA' AND cust_name LIKE '%Smith%';  -- Better
 
-- ✅ Use LIMIT for large result sets
SELECT TOP 100 * FROM Products WHERE prod_name LIKE '%toy%';

3. Handle Edge Cases:

-- ✅ Account for trailing spaces
WHERE TRIM(prod_name) LIKE 'Fish%';
 
-- ✅ Handle case sensitivity
WHERE UPPER(cust_name) LIKE 'JOHN%';
 
-- ✅ Consider NULL values
WHERE cust_email LIKE '%@gmail.com' OR cust_email IS NULL;

4. Use Appropriate Alternatives:

-- ✅ Full-text search for complex text queries
WHERE CONTAINS(prod_desc, 'wireless AND bluetooth');
 
-- ✅ Regular expressions for complex patterns
WHERE cust_email REGEXP '^[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}$';

Calculated Fields

What Are Calculated Fields?

Calculated fields are virtual columns created on-the-fly within SQL SELECT statements. They don't exist in database tables but are computed from existing data during query execution.

Why Use Calculated Fields?

Common scenarios where calculated fields are essential:

  • Combining data: Company name + location from separate columns
  • Formatting: City, state, ZIP into mailing label format
  • Case conversion: Mixed case data to uppercase for reports
  • Calculations: Price × quantity for invoice totals
  • Aggregations: Totals, averages, and other computations

Client vs Server Processing

Best Practice: Perform data transformations on the database server rather than in client applications for better performance.

Field Concatenation

Basic Concatenation Syntax

Concatenation operators vary by DBMS:

DBMSOperatorExample
SQL Server+column1 + column2
DB2, Oracle, PostgreSQL, SQLite`
MySQL, MariaDBCONCAT()CONCAT(column1, column2)

Example: Vendor Name with Location

SQL Server

SELECT vend_name + '(' + vend_country + ')'
FROM Vendors
ORDER BY vend_name;

Oracle/PostgreSQL/SQLite

SELECT vend_name || '(' || vend_country || ')'
FROM Vendors
ORDER BY vend_name;

MySQL/MariaDB

SELECT CONCAT(vend_name, ' (', vend_country, ')')
FROM Vendors
ORDER BY vend_name;

Output:

Bear Emporium                           (USA             )
Bears R Us                              (USA             )
Doll House Inc.                         (USA             )
Fun and Games                           (England         )

Handling Padded Spaces

Many databases pad text values to column width. Use RTRIM() to remove trailing spaces:

SQL Server

SELECT RTRIM(vend_name) + ' (' + RTRIM(vend_country) + ')'
FROM Vendors
ORDER BY vend_name;

Oracle/PostgreSQL/SQLite

SELECT RTRIM(vend_name) || ' (' || RTRIM(vend_country) || ')'
FROM Vendors
ORDER BY vend_name;

Clean Output:

Bear Emporium (USA)
Bears R Us (USA)
Doll House Inc. (USA)
Fun and Games (England)

TRIM Functions

| Function | Purpose |-----|-----|----- | RTRIM() | Removes spaces from right side | LTRIM() | Removes spaces from left side | TRIM() | Removes spaces from both sides

Using Aliases

Calculated fields need names to be referenced by client applications. Aliases provide alternate names using the AS keyword.

Basic Alias Syntax

SELECT RTRIM(vend_name) + ' (' + RTRIM(vend_country) + ')'
       AS vend_title
FROM Vendors
ORDER BY vend_name;

Output:

vend_title
-----------------------------------------------------------
Bear Emporium (USA)
Bears R Us (USA)
Doll House Inc. (USA)

MySQL/MariaDB Example

SELECT CONCAT(RTRIM(vend_name), ' (', RTRIM(vend_country), ')') 
       AS vend_title
FROM Vendors
ORDER BY vend_name;

Alias Best Practices

Good Practices:

  • Use single-word aliases
  • Use descriptive names
  • Include AS keyword (even if optional)
  • Use consistent naming conventions

Avoid:

  • Multi-word aliases (create client application issues)
  • Special characters in alias names
  • Reserved SQL keywords as aliases

Other Uses for Aliases

  • Rename columns with illegal characters (spaces, special chars)
  • Expand cryptic column names for clarity
  • Resolve ambiguous column names in joins
  • Create meaningful names for complex calculations

Mathematical Calculations

Calculated fields excel at performing mathematical operations on retrieved data.

Basic Example: Order Item Totals

-- Show individual columns
SELECT prod_id, quantity, item_price
FROM OrderItems
WHERE order_num = 20008;

Output:

prod_id       quantity       item_price
----------    -----------    ---------------------
RGAN01        5              4.9900
BR03          5              11.9900
BNBG01        10             3.4900
-- Add calculated total
SELECT prod_id,
       quantity,
       item_price,
       quantity * item_price AS expanded_price
FROM OrderItems
WHERE order_num = 20008;

Output:

prod_id     quantity      item_price      expanded_price
----------  -----------   ------------    -----------------
RGAN01      5             4.9900          24.9500
BR03        5             11.9900         59.9500
BNBG01      10            3.4900          34.9000

SQL Mathematical Operators

OperatorDescriptionExample
+Additionprice + tax
-Subtractionprice - discount
*Multiplicationquantity * price
/Divisiontotal / count

Order of Operations

Use parentheses to control calculation order:

SELECT prod_id,
       quantity,
       item_price,
       (quantity * item_price) * 0.9 AS discounted_total,
       quantity * (item_price - 1.00) AS bulk_discount
FROM OrderItems;

Testing Calculations

Use SELECT without FROM to test expressions:

-- Test mathematical operations
SELECT 3 * 2;                    -- Returns: 6
 
-- Test string functions  
SELECT TRIM('   abc   ');        -- Returns: abc
 
-- Test date functions (MySQL/MariaDB)
SELECT CURDATE();                -- Returns: current date

Practical Examples

Example 1: Full Customer Address

SELECT cust_name,
       cust_address || ', ' || cust_city || ', ' || cust_state || ' ' || cust_zip
       AS full_address
FROM Customers;

Example 2: Sales Tax Calculation

SELECT prod_id,
       prod_name,
       prod_price,
       prod_price * 0.08 AS sales_tax,
       prod_price * 1.08 AS total_with_tax
FROM Products;

Example 3: Profit Margin Analysis

SELECT prod_id,
       prod_price,
       prod_cost,
       prod_price - prod_cost AS profit,
       ((prod_price - prod_cost) / prod_price) * 100 AS profit_margin_pct
FROM Products;

Challenge Solutions

Challenge 1: Column Renaming

SELECT vend_id,
       vend_name AS vname,
       vend_address AS vaddress,
       vend_city AS vcity
FROM Vendors
ORDER BY vname;  -- Can use original or alias name

Challenge 2: Sale Price Calculation

SELECT prod_id,
       prod_price,
       prod_price * 0.9 AS sale_price
FROM Products;

Advanced Techniques

Complex String Formatting

-- Create formatted product description
SELECT UPPER(RTRIM(prod_name)) || ' - ' || 
       LOWER(RTRIM(prod_desc)) || ' ($' || 
       CAST(prod_price AS VARCHAR(10)) || ')'
       AS formatted_product
FROM Products;

Conditional Calculations

-- Different discount rates based on quantity
SELECT prod_id,
       quantity,
       item_price,
       CASE 
         WHEN quantity >= 10 THEN item_price * 0.9
         WHEN quantity >= 5 THEN item_price * 0.95
         ELSE item_price
       END AS discounted_price
FROM OrderItems;

Date Calculations

-- Calculate days since order
SELECT order_num,
       order_date,
       CURRENT_DATE - order_date AS days_since_order
FROM Orders;

Data Manipulation Functions

What Are SQL Functions?

Functions are operations performed on data to facilitate conversion and manipulation. They're essential tools in your SQL toolkit for transforming and processing data efficiently.

Example: RTRIM() removes trailing spaces from strings.

The Portability Challenge

⚠️ Major Limitation: Unlike SQL statements (SELECT, INSERT, etc.), functions are highly DBMS-specific.

Function Differences Across DBMSs

Function TypeDBMS Variations
Extract substringDB2, Oracle, PostgreSQL, SQLite: SUBSTR()``<br>MariaDB, MySQL, SQL Server: SUBSTRING()
Data type conversionOracle: Multiple specific functions<br>DB2, PostgreSQL, SQL Server: CAST()``<br>MariaDB, MySQL, SQL Server: CONVERT()
Current dateDB2, PostgreSQL: CURRENT_DATE``<br>MariaDB, MySQL: CURDATE()``<br>Oracle: SYSDATE``<br>SQL Server: GETDATE()``<br>SQLite: DATE()

Should You Use Functions?

Pros:

  • Better performance (DBMS handles operations efficiently)
  • Cleaner, more readable code
  • Built-in optimizations

Cons:

  • Reduced code portability
  • DBMS lock-in
  • Migration complexity

Best Practice: If you use functions, document your code thoroughly and specify which DBMS you're targeting.

Types of SQL Functions

1. Text Functions

Manipulate strings of text (trimming, padding, case conversion)

2. Numeric Functions

Perform mathematical operations on numeric data

3. Date and Time Functions

Manipulate date/time values and extract components

4. Formatting Functions

Generate user-friendly outputs (localized dates, currencies)

5. System Functions

Return DBMS-specific information (user login, system info)

Text Manipulation Functions

Basic Example: UPPER()

SELECT vend_name, UPPER(vend_name) AS vend_name_upcase
FROM Vendors
ORDER BY vend_name;

Output:

vend_name              vend_name_upcase
Bear Emporium          BEAR EMPORIUM
Bears R Us             BEARS R US
Doll House Inc.        DOLL HOUSE INC.
Fun and Games          FUN AND GAMES

Common Text Functions

FunctionDescription
LEFT()Returns characters from left of string
LENGTH() / LEN() / DATALENGTH()Returns string length
LOWER()Converts to lowercase
LTRIM()Trims whitespace from left
RIGHT()Returns characters from right of string
RTRIM()Trims whitespace from right
SUBSTR() / SUBSTRING()Extracts part of string
SOUNDEX()Returns phonetic representation
UPPER()Converts to uppercase

SOUNDEX() Function

Purpose: Matches strings by how they sound, not exact spelling.

-- This returns no results
SELECT cust_name, cust_contact
FROM Customers
WHERE cust_contact = 'Michael Green';
 
-- This finds "Michelle Green" because it sounds similar
SELECT cust_name, cust_contact
FROM Customers
WHERE SOUNDEX(cust_contact) = SOUNDEX('Michael Green');

Output:

cust_name              cust_contact
Kids Place             Michelle Green

Note: SOUNDEX() is not supported in PostgreSQL and requires special compilation in SQLite.

Date and Time Manipulation Functions

Date/time functions are among the most important but least portable SQL functions.

Extracting Year from Date

SQL Server

SELECT order_num
FROM Orders
WHERE DATEPART(yy, order_date) = 2020;

PostgreSQL

SELECT order_num
FROM Orders
WHERE DATE_PART('year', order_date) = 2020;

Oracle

SELECT order_num
FROM Orders
WHERE EXTRACT(year FROM order_date) = 2020;

MySQL/MariaDB/DB2

SELECT order_num
FROM Orders
WHERE YEAR(order_date) = 2020;

SQLite

SELECT order_num
FROM Orders
WHERE strftime('%Y', order_date) = '2020';

Alternative: Using BETWEEN

-- Oracle example
SELECT order_num
FROM Orders
WHERE order_date BETWEEN to_date('2020-01-01', 'yyyy-mm-dd')
                     AND to_date('2020-12-31', 'yyyy-mm-dd');

Numeric Manipulation Functions

Numeric functions are the most consistent across different DBMSs.

Common Numeric Functions

FunctionDescription
ABS()Returns absolute value
COS()Returns trigonometric cosine
EXP()Returns exponential value
PI()Returns value of PI
SIN()Returns trigonometric sine
SQRT()Returns square root
TAN()Returns trigonometric tangent

Examples

-- Calculate absolute values
SELECT product_id, price, ABS(price - 10) AS price_difference
FROM Products;
 
-- Mathematical calculations
SELECT PI() AS pi_value, SQRT(16) AS square_root;

Best Practices

1. Code Documentation

-- SQL Server specific function
SELECT DATEPART(yy, order_date) AS order_year
FROM Orders;

2. Function Naming Consistency

Choose a case style and stick with it:

  • UPPER()
  • upper()
  • Upper()
  • Mixing styles ❌

3. Performance Considerations

-- Efficient: Use functions in SELECT
SELECT cust_name, UPPER(cust_name) AS name_upper
FROM Customers;
 
-- Less efficient: Functions in WHERE (may prevent index usage)
WHERE UPPER(cust_name) = 'JOHN DOE'

4. Portability Planning

  • Document DBMS dependencies
  • Consider abstraction layers
  • Plan migration strategies
  • Test across target platforms

Practical Challenges

Challenge 1: User Login Generation

Create a login combining first 2 characters of contact name + first 3 characters of city (uppercase):

SELECT 
    cust_id,
    cust_name,
    UPPER(
        CONCAT(
            LEFT(cust_contact, 2),
            LEFT(cust_city, 3)
        )
    ) AS user_login
FROM Customers;

Challenge 2: Date Range Filtering

Find all orders from January 2020:

-- SQL Server
SELECT order_num, order_date
FROM Orders
WHERE DATEPART(mm, order_date) = 1 
  AND DATEPART(yy, order_date) = 2020
ORDER BY order_date;
 
-- MySQL/MariaDB
SELECT order_num, order_date
FROM Orders
WHERE MONTH(order_date) = 1 
  AND YEAR(order_date) = 2020
ORDER BY order_date;

Aggregate Functions

What Are Aggregate Functions?

Aggregate functions operate on a set of rows to calculate and return a single value. They're essential for data analysis and reporting, allowing you to:

  • Count rows in a table
  • Calculate sums, averages, minimums, and maximums
  • Summarize data efficiently without retrieving all records

The Five Core Aggregate Functions

FunctionDescription
AVG()Returns a column's average value
COUNT()Returns the number of rows in a column
MAX()Returns a column's highest value
MIN()Returns a column's lowest value
SUM()Returns the sum of a column's values

AVG() Function

Returns the average value of a numeric column.

Basic Usage

SELECT AVG(prod_price) AS avg_price
FROM Products;

Output: 6.823333

With WHERE Clause

SELECT AVG(prod_price) AS avg_price
FROM Products
WHERE vend_id = 'DLL01';

Output: 3.8650

Important: AVG() only works with numeric columns and ignores NULL values.

COUNT() Function

Counts rows in a table or column.

Two Usage Patterns

COUNT(*) - Count All Rows

SELECT COUNT(*) AS num_cust
FROM Customers;

Output: 5

COUNT(column) - Count Non-NULL Values

SELECT COUNT(cust_email) AS num_cust
FROM Customers;

Output: 3 (only customers with email addresses)

Note: COUNT(*) includes NULL values, COUNT(column) excludes them.

MAX() Function

Returns the highest value in a column.

SELECT MAX(prod_price) AS max_price
FROM Products;

Output: 11.9900

Works with Text Data

MAX() can also work with textual columns, returning the value that would appear last when sorted alphabetically.

Note: NULL values are ignored by MAX().

MIN() Function

Returns the lowest value in a column.

SELECT MIN(prod_price) AS min_price
FROM Products;

Output: 3.4900

** Works with Text Data**

Like MAX(), MIN() can work with textual columns, returning the value that would appear first when sorted alphabetically.

Note: NULL values are ignored by MIN().

** SUM() Function**

Returns the total sum of values in a numeric column.

Basic Sum

SELECT SUM(quantity) AS items_ordered
FROM OrderItems
WHERE order_num = 20005;

Output: 200

Calculated Sum

SELECT SUM(item_price * quantity) AS total_price
FROM OrderItems
WHERE order_num = 20005;

Output: 1648.0000

Note: SUM() can perform calculations on multiple columns and ignores NULL values.

Using DISTINCT with Aggregates

Aggregate functions can work with unique values only by using the DISTINCT keyword.

Default Behavior (ALL)

SELECT AVG(prod_price) AS avg_price
FROM Products
WHERE vend_id = 'DLL01';

With DISTINCT

SELECT AVG(DISTINCT prod_price) AS avg_price
FROM Products
WHERE vend_id = 'DLL01';

Output: 4.2400 (higher because duplicate lower prices are excluded)

DISTINCT Rules

  • ✅ Can use with COUNT(column)
  • ❌ Cannot use with COUNT(*)
  • ⚠️ No practical benefit with MIN() and MAX()

Combining Multiple Aggregate Functions

You can use multiple aggregate functions in a single query:

SELECT COUNT(*) AS num_items,
       MIN(prod_price) AS price_min,
       MAX(prod_price) AS price_max,
       AVG(prod_price) AS price_avg
FROM Products;

Output:

num_items | price_min | price_max | price_avg
----------|-----------|-----------|----------
9         | 3.4900    | 11.9900   | 6.823333

Best Practices

Alias Naming

  • Use descriptive aliases for aggregate results
  • Avoid using actual column names as aliases
  • This prevents potential conflicts and obscure error messages

NULL Handling

  • All aggregate functions ignore NULL values (except COUNT(*))
  • COUNT(*) includes rows with NULL values
  • COUNT(column) excludes rows with NULL values

Performance Tips

  • Use aggregate functions instead of retrieving all data for analysis
  • Combine multiple aggregates in one query when possible
  • Use WHERE clauses to filter data before aggregation

Common Use Cases

Data Analysis

-- Sales summary
SELECT 
    COUNT(*) AS total_orders,
    SUM(order_total) AS total_revenue,
    AVG(order_total) AS avg_order_value,
    MAX(order_total) AS largest_order
FROM Orders
WHERE order_date >= '2024-01-01';

Quality Checks

-- Data completeness check
SELECT 
    COUNT(*) AS total_records,
    COUNT(email) AS records_with_email,
    COUNT(phone) AS records_with_phone
FROM Customers;

Reporting

-- Product performance by category
SELECT 
    category,
    COUNT(*) AS product_count,
    AVG(price) AS avg_price,
    MIN(price) AS min_price,
    MAX(price) AS max_price
FROM Products
GROUP BY category;

Data Grouping

Understanding Data Grouping

Data grouping allows you to divide data into logical sets and perform aggregate calculations on each group, rather than on the entire dataset.

Why Group Data?

Instead of calculating aggregates for all data:

-- Count all products from vendor DLL01
SELECT COUNT(*) AS num_prods
FROM Products
WHERE vend_id = 'DLL01';

Output: 4

You might want to:

  • Count products offered by each vendor
  • Find vendors offering only one product
  • Identify vendors with more than 10 products

This is where GROUP BY becomes essential.

Creating Groups with GROUP BY

Basic GROUP BY Syntax

SELECT vend_id, COUNT(*) AS num_prods
FROM Products
GROUP BY vend_id;

Output:

vend_id  num_prods
-------  ---------
BRS01    3
DLL01    4
FNG01    2

Analysis: The query groups products by vendor ID and counts products in each group automatically.

GROUP BY Rules

1. Multiple Columns Allowed

SELECT vend_id, prod_type, COUNT(*) AS num_prods
FROM Products
GROUP BY vend_id, prod_type;

2. Nested Grouping

  • Data summarized at the last specified group level
  • All columns evaluated together when grouping

3. Column Requirements

  • Every non-aggregate column in SELECT must be in GROUP BY
  • Use actual column names or expressions, not aliases

4. Expression Consistency

-- If using expression in SELECT...
SELECT UPPER(vend_id), COUNT(*)
FROM Products
GROUP BY UPPER(vend_id);  -- Must use same expression in GROUP BY

5. Data Type Limitations

  • Most implementations don't allow variable-length types (TEXT, MEMO)

6. NULL Handling

  • NULL values grouped together
  • Multiple NULLs form one group

7. Clause Order

SELECT columns
FROM table
WHERE conditions      -- Before GROUP BY
GROUP BY columns      -- After WHERE
ORDER BY columns      -- After GROUP BY

Filtering Groups with HAVING

WHERE vs HAVING

WHEREHAVING
Filters rowsFilters groups
Applied before groupingApplied after grouping
Works on individual recordsWorks on aggregate results

Basic HAVING Example

SELECT cust_id, COUNT(*) AS orders
FROM Orders
GROUP BY cust_id
HAVING COUNT(*) >= 2;

Output:

cust_id    orders
---------- -----------
1000000001 2

Analysis: Returns only customers with 2 or more orders.

HAVING Supports All WHERE Operators

All WHERE clause techniques work with HAVING:

  • Comparison operators (>=, <, =, etc.)
  • Wildcard patterns (LIKE)
  • Multiple conditions (AND, OR)
  • IN and BETWEEN operators

Combining WHERE and HAVING

Complex Filtering Example

SELECT vend_id, COUNT(*) AS num_prods
FROM Products
WHERE prod_price >= 4        -- Filter rows first
GROUP BY vend_id
HAVING COUNT(*) >= 2;        -- Filter groups second

Output:

vend_id  num_prods
-------  -----------
BRS01    3
FNG01    2

Analysis:

  1. WHERE filters products with price ≥ 4
  2. GROUP BY groups remaining products by vendor
  3. HAVING keeps only vendors with ≥ 2 products

Without WHERE Clause

SELECT vend_id, COUNT(*) AS num_prods
FROM Products
GROUP BY vend_id
HAVING COUNT(*) >= 2;

Output:

vend_id  num_prods
-------  -----------
BRS01    3
DLL01    4          -- Included because it has 4 products total
FNG01    2

GROUP BY vs ORDER BY

Key Differences

GROUP BYORDER BY
Groups rowsSorts output
Output may not be in group orderAlways sorts output
Only selected columns/expressionsAny columns (even non-selected)
Required with aggregatesOptional

Best Practice: Always Use ORDER BY

-- Good: Explicit sorting
SELECT order_num, COUNT(*) AS items
FROM OrderItems
GROUP BY order_num
HAVING COUNT(*) >= 3
ORDER BY items, order_num;

Output:

order_num  items
---------  -----
20006      3
20009      3
20007      5
20008      5

Important: Never rely on GROUP BY to sort data. Always include ORDER BY for guaranteed sort order.

SELECT Clause Order

Required Sequence

| Clause | Description | Required |-----|----- | SELECT | Columns or expressions | Yes | FROM | Table to retrieve from | Only if selecting from table | WHERE | Row-level filtering | No | GROUP BY | Group specification | Only with aggregates by group | HAVING | Group-level filtering | No | ORDER BY | Output sort order | No

Complete Example

SELECT vend_id, AVG(prod_price) AS avg_price
FROM Products
WHERE prod_price > 0
GROUP BY vend_id
HAVING AVG(prod_price) >= 5
ORDER BY avg_price DESC;

Practical Examples

Example 1: Sales Analysis by Region

SELECT region, 
       COUNT(*) AS total_orders,
       SUM(order_total) AS total_sales,
       AVG(order_total) AS avg_order_value
FROM Orders
WHERE order_date >= '2024-01-01'
GROUP BY region
HAVING SUM(order_total) > 10000
ORDER BY total_sales DESC;

Example 2: Product Performance by Category

SELECT category,
       COUNT(*) AS product_count,
       MIN(price) AS min_price,
       MAX(price) AS max_price,
       AVG(price) AS avg_price
FROM Products
GROUP BY category
HAVING COUNT(*) >= 5
ORDER BY avg_price DESC;

Example 3: Customer Segmentation

SELECT 
    CASE 
        WHEN COUNT(*) >= 10 THEN 'High Value'
        WHEN COUNT(*) >= 5 THEN 'Medium Value'
        ELSE 'Low Value'
    END AS customer_segment,
    COUNT(*) AS customer_count
FROM (
    SELECT cust_id, COUNT(*) AS order_count
    FROM Orders
    GROUP BY cust_id
) customer_orders
GROUP BY customer_segment
ORDER BY customer_count DESC;

Challenge Solutions

Challenge 1: Order Lines Count

SELECT order_num, COUNT(*) AS order_lines
FROM OrderItems
GROUP BY order_num
ORDER BY order_lines;

Challenge 2: Cheapest Item by Vendor

SELECT vend_id, MIN(prod_price) AS cheapest_item
FROM Products
GROUP BY vend_id
ORDER BY cheapest_item;

Challenge 3: Large Orders (≥100 items)

SELECT order_num
FROM OrderItems
GROUP BY order_num
HAVING SUM(quantity) >= 100;

Challenge 4: High-Value Orders (≥$1000)

SELECT order_num
FROM OrderItems
GROUP BY order_num
HAVING SUM(item_price * quantity) >= 1000
ORDER BY order_num;

Challenge 5: What's Wrong?

-- INCORRECT:
SELECT order_num, COUNT(*) AS items
FROM OrderItems
GROUP BY items          -- ❌ Can't group by alias
HAVING COUNT(*) >= 3
ORDER BY items, order_num;
 
-- CORRECT:
SELECT order_num, COUNT(*) AS items
FROM OrderItems
GROUP BY order_num      -- ✅ Group by actual column
HAVING COUNT(*) >= 3
ORDER BY items, order_num;

Problem: You cannot use aliases in GROUP BY clause.

Advanced Grouping Techniques

Multiple Grouping Levels

SELECT vend_id, 
       EXTRACT(YEAR FROM order_date) AS order_year,
       COUNT(*) AS orders,
       SUM(total_amount) AS yearly_sales
FROM Orders
GROUP BY vend_id, EXTRACT(YEAR FROM order_date)
ORDER BY vend_id, order_year;

Conditional Aggregation

SELECT vend_id,
       COUNT(*) AS total_products,
       COUNT(CASE WHEN prod_price > 10 THEN 1 END) AS expensive_products,
       COUNT(CASE WHEN prod_price <= 10 THEN 1 END) AS affordable_products
FROM Products
GROUP BY vend_id;

Subqueries

Understanding Subqueries

Subqueries are SQL queries embedded inside other queries. They enable you to break down complex data retrieval into logical steps and create more flexible, powerful SQL statements.

Why Use Subqueries?

Consider this scenario: Find all customers who ordered item 'RGAN01'.

Traditional approach (3 separate queries):

  1. Find orders containing 'RGAN01'
  2. Find customer IDs for those orders
  3. Get customer details for those IDs

Subquery approach: Combine all steps into one statement.

Filtering with Subqueries

Step-by-Step Example

Step 1: Find Orders with Specific Product

SELECT order_num
FROM OrderItems
WHERE prod_id = 'RGAN01';

Output:

order_num
-----------
20007
20008

Step 2: Find Customers for Those Orders

SELECT cust_id
FROM Orders
WHERE order_num IN (20007, 20008);

Output:

cust_id
----------
1000000004
1000000005

Step 3: Combine with Subquery

SELECT cust_id
FROM Orders
WHERE order_num IN (SELECT order_num
                    FROM OrderItems
                    WHERE prod_id = 'RGAN01');

Output:

cust_id
----------
1000000004
1000000005

Complete Nested Subquery

SELECT cust_name, cust_contact
FROM Customers
WHERE cust_id IN (SELECT cust_id
                  FROM Orders
                  WHERE order_num IN (SELECT order_num
                                      FROM OrderItems
                                      WHERE prod_id = 'RGAN01'));

Output:

cust_name                         cust_contact
-----------------------------     --------------------
Fun4All                           Denise L. Stephens
The Toy Store                     Kim Howard

How Subqueries Execute

Processing Order: Always innermost to outermost

  1. Innermost: SELECT order_num FROM OrderItems WHERE prod_id = 'RGAN01'

  2. Returns: 20007, 20008

  3. Middle: SELECT cust_id FROM Orders WHERE order_num IN (20007, 20008)

  4. Returns: 1000000004, 1000000005

  5. Outermost: SELECT cust_name, cust_contact FROM Customers WHERE cust_id IN (1000000004, 1000000005)

  6. Returns: Customer details

** Subqueries as Calculated Fields**

Counting Related Records

Goal: Show each customer with their total number of orders.

SELECT cust_name,
       cust_state,
       (SELECT COUNT(*)
        FROM Orders
        WHERE Orders.cust_id = Customers.cust_id) AS orders
FROM Customers
ORDER BY cust_name;

Output:

cust_name                    cust_state    orders
-------------------------    ----------    ------
Fun4All                      IN            1
Fun4All                      AZ            1
Kids Place                   OH            0
The Toy Store                IL            1
Village Toys                 MI            2

Fully Qualified Column Names

Critical: Use table.column syntax when column names exist in multiple tables.

** ✅ Correct (Fully Qualified)**

WHERE Orders.cust_id = Customers.cust_id

❌ Incorrect (Ambiguous)

WHERE cust_id = cust_id  -- Compares column to itself!

Wrong Result:

cust_name                    cust_state    orders
-------------------------    ----------    ------
Fun4All                      IN            5
Fun4All                      AZ            5
Kids Place                   OH            5
The Toy Store                IL            5
Village Toys                 MI            5

Subquery Rules and Best Practices

Essential Rules

  1. Single Column Only: Subqueries in WHERE clauses can only return one column
  2. Proper Nesting: No limit on nesting depth, but performance degrades
  3. Fully Qualify: Always use table.column when ambiguity exists
  4. Parentheses Required: Subqueries must be enclosed in parentheses

Formatting Best Practices

-- Good: Well-formatted subquery
SELECT cust_name, cust_contact
FROM Customers
WHERE cust_id IN (
    SELECT cust_id
    FROM Orders
    WHERE order_num IN (
        SELECT order_num
        FROM OrderItems
        WHERE prod_id = 'RGAN01'
    )
);

Performance Considerations

  • Subqueries: Can be less efficient for complex operations
  • JOINs: Often more efficient alternative (covered in next lesson)
  • Testing: Always test performance with real data volumes

Practical Examples

Example 1: High-Value Product Customers

-- Find customers who bought expensive items
SELECT DISTINCT cust_name
FROM Customers
WHERE cust_id IN (
    SELECT cust_id
    FROM Orders
    WHERE order_num IN (
        SELECT order_num
        FROM OrderItems
        WHERE item_price >= 10
    )
);

Example 2: Product Sales Summary

-- Show each product with total quantity sold
SELECT prod_name,
       (SELECT SUM(quantity)
        FROM OrderItems
        WHERE OrderItems.prod_id = Products.prod_id) AS total_sold
FROM Products
ORDER BY total_sold DESC;

Example 3: Customer Spending Analysis

-- Calculate total spending per customer
SELECT cust_name,
       (SELECT SUM(item_price * quantity)
        FROM OrderItems oi
        JOIN Orders o ON oi.order_num = o.order_num
        WHERE o.cust_id = c.cust_id) AS total_spent
FROM Customers c
ORDER BY total_spent DESC;

Challenge Solutions

Challenge 1: Customers with High-Value Items

SELECT DISTINCT cust_name
FROM Customers
WHERE cust_id IN (
    SELECT cust_id
    FROM Orders
    WHERE order_num IN (
        SELECT order_num
        FROM OrderItems
        WHERE item_price >= 10
    )
);

Challenge 2: BR01 Order Dates

SELECT cust_id, order_date
FROM Orders
WHERE order_num IN (
    SELECT order_num
    FROM OrderItems
    WHERE prod_id = 'BR01'
)
ORDER BY order_date;

Challenge 3: BR01 Customer Emails

SELECT cust_email
FROM Customers
WHERE cust_id IN (
    SELECT cust_id
    FROM Orders
    WHERE order_num IN (
        SELECT order_num
        FROM OrderItems
        WHERE prod_id = 'BR01'
    )
);

Challenge 4: Customer Order Totals

SELECT cust_id,
       (SELECT SUM(item_price * quantity)
        FROM OrderItems oi
        JOIN Orders o ON oi.order_num = o.order_num
        WHERE o.cust_id = Orders.cust_id) AS total_ordered
FROM Orders
GROUP BY cust_id
ORDER BY total_ordered DESC;

Challenge 5: Product Sales Quantities

SELECT prod_name,
       (SELECT COALESCE(SUM(quantity), 0)
        FROM OrderItems
        WHERE OrderItems.prod_id = Products.prod_id) AS quant_sold
FROM Products
ORDER BY quant_sold DESC;

Advanced Subquery Techniques

Correlated Subqueries

-- Find customers with above-average order values
SELECT cust_name
FROM Customers c
WHERE (
    SELECT AVG(item_price * quantity)
    FROM OrderItems oi
    JOIN Orders o ON oi.order_num = o.order_num
    WHERE o.cust_id = c.cust_id
) > (
    SELECT AVG(item_price * quantity)
    FROM OrderItems
);

EXISTS vs IN

-- Using EXISTS (often more efficient)
SELECT cust_name
FROM Customers c
WHERE EXISTS (
    SELECT 1
    FROM Orders o
    WHERE o.cust_id = c.cust_id
    AND o.order_date >= '2024-01-01'
);
 
-- Using IN (traditional approach)
SELECT cust_name
FROM Customers
WHERE cust_id IN (
    SELECT cust_id
    FROM Orders
    WHERE order_date >= '2024-01-01'
);

Subqueries in SELECT, FROM, and HAVING

-- Subquery in SELECT (calculated field)
SELECT prod_name,
       (SELECT COUNT(*) FROM OrderItems WHERE prod_id = p.prod_id) AS times_ordered
FROM Products p;
 
-- Subquery in FROM (derived table)
SELECT avg_price_category, COUNT(*) as product_count
FROM (
    SELECT 
        CASE 
            WHEN prod_price < 5 THEN 'Low'
            WHEN prod_price < 15 THEN 'Medium'
            ELSE 'High'
        END AS avg_price_category
    FROM Products
) price_categories
GROUP BY avg_price_category;
 
-- Subquery in HAVING
SELECT vend_id, COUNT(*) as product_count
FROM Products
GROUP BY vend_id
HAVING COUNT(*) > (
    SELECT AVG(product_count)
    FROM (
        SELECT COUNT(*) as product_count
        FROM Products
        GROUP BY vend_id
    ) vendor_counts
);

Table Joins

Understanding Relational Tables

Why Split Data Across Tables?

Consider a product database. Instead of storing all information in one table:

❌ Poor Design (Single Table)

Products Table:
prod_id | prod_name | prod_price | vend_name | vend_address | vend_contact
--------|-----------|------------|-----------|--------------|-------------
P001    | Toy Bear  | 9.99       | Bears R Us| 123 Main St  | John Smith
P002    | Toy Car   | 5.99       | Bears R Us| 123 Main St  | John Smith

Problems:

  • Wasted space: Vendor info repeated for each product
  • Update anomalies: Change vendor address in multiple places
  • Data inconsistency: Risk of typos in repeated data

✅ Good Design (Related Tables)

Vendors Table:
vend_id | vend_name | vend_address | vend_contact
--------|-----------|--------------|-------------
V001    | Bears R Us| 123 Main St  | John Smith
 
Products Table:
prod_id | prod_name | prod_price | vend_id
--------|-----------|------------|--------
P001    | Toy Bear  | 9.99       | V001
P002    | Toy Car   | 5.99       | V001

Benefits:

  • No data duplication
  • Single point of update
  • Data consistency guaranteed
  • Better scalability

Why Use Joins?

Challenge: Data is in multiple tables, but you need it in one result set.

Solution: Joins associate tables within a SELECT statement to return combined data.

Key Point: Joins are created by the DBMS on-the-fly during query execution - they don't exist as physical entities.

Creating Basic Joins

Simple Join Syntax

SELECT vend_name, prod_name, prod_price
FROM Vendors, Products
WHERE Vendors.vend_id = Products.vend_id;

Output:

vend_name               prod_name               prod_price
--------------------    --------------------    ----------
Doll House Inc.         Fish bean bag toy       3.4900
Doll House Inc.         Bird bean bag toy       3.4900
Bears R Us              8 inch teddy bear       5.9900
Bears R Us              12 inch teddy bear      8.9900
Fun and Games           King doll               9.4900

Key Components

  1. FROM clause: Lists all tables to join
  2. WHERE clause: Defines the relationship between tables
  3. Fully qualified names: table.column format required for ambiguous columns

The Critical WHERE Clause

Without WHERE (Cartesian Product)

-- ❌ DANGEROUS: Missing WHERE clause
SELECT vend_name, prod_name, prod_price
FROM Vendors, Products;

Result: Every vendor paired with every product (54 rows instead of 9!)

vend_name          prod_name                     prod_price
----------------   ----------------------------  ----------
Bears R Us         8 inch teddy bear             5.99
Bears R Us         Fish bean bag toy             3.49  -- Wrong!
Bears R Us         King doll                     9.49  -- Wrong!
Bear Emporium      8 inch teddy bear             5.99  -- Wrong!
Bear Emporium      Fish bean bag toy             3.49  -- Wrong!
...

With Proper WHERE Clause

-- ✅ CORRECT: Proper join condition
SELECT vend_name, prod_name, prod_price
FROM Vendors, Products
WHERE Vendors.vend_id = Products.vend_id;

Critical Rule: Always include WHERE clauses in joins to avoid Cartesian products.

Inner Joins (ANSI Syntax)

Modern INNER JOIN Syntax

SELECT vend_name, prod_name, prod_price
FROM Vendors
INNER JOIN Products ON Vendors.vend_id = Products.vend_id;

Advantages of ANSI Syntax:

  • Clearer intent: Join logic separated from filtering
  • Less error-prone: Harder to forget join conditions
  • Industry standard: Preferred by SQL purists

Syntax Comparison

TraditionalANSI INNER JOIN
FROM table1, table2 WHERE table1.id = table2.idFROM table1 INNER JOIN table2 ON table1.id = table2.id
Join condition in WHEREJoin condition in ON
Mixing joins and filtersClear separation

Joining Multiple Tables

Three-Table Join Example

SELECT prod_name, vend_name, prod_price, quantity
FROM OrderItems, Products, Vendors
WHERE Products.vend_id = Vendors.vend_id
  AND OrderItems.prod_id = Products.prod_id
  AND order_num = 20007;

Output:

prod_name            vend_name         prod_price  quantity
---------------      -------------     ----------  --------
18 inch teddy bear   Bears R Us        11.9900     50
Fish bean bag toy    Doll House Inc.   3.4900      100
Bird bean bag toy    Doll House Inc.   3.4900      100

ANSI Syntax for Multiple Tables

SELECT prod_name, vend_name, prod_price, quantity
FROM OrderItems
INNER JOIN Products ON OrderItems.prod_id = Products.prod_id
INNER JOIN Vendors ON Products.vend_id = Vendors.vend_id
WHERE order_num = 20007;

Joins vs Subqueries

Subquery Approach (from Lesson 11)

-- Find customers who ordered product RGAN01
SELECT cust_name, cust_contact
FROM Customers
WHERE cust_id IN (
    SELECT cust_id
    FROM Orders
    WHERE order_num IN (
        SELECT order_num
        FROM OrderItems
        WHERE prod_id = 'RGAN01'
    )
);

Join Approach (More Efficient)

-- Same result using joins
SELECT cust_name, cust_contact
FROM Customers, Orders, OrderItems
WHERE Customers.cust_id = Orders.cust_id
  AND OrderItems.order_num = Orders.order_num
  AND prod_id = 'RGAN01';

Output:

cust_name                        cust_contact
-----------------------------    --------------------
Fun4All                          Denise L. Stephens
The Toy Store                    Kim Howard

Performance Comparison

ApproachProsCons
SubqueriesEasier to understandOften slower, more resource-intensive
JoinsUsually faster, more efficientCan be more complex to write

Best Practices

1. Always Use Fully Qualified Column Names

-- ✅ Good: Clear and unambiguous
WHERE Customers.cust_id = Orders.cust_id
 
-- ❌ Bad: Ambiguous, may cause errors
WHERE cust_id = cust_id

2. Performance Considerations

  • Join only necessary tables
  • Use appropriate indexes on join columns
  • Test both joins and subqueries for your specific use case
  • Monitor performance with large datasets

3. Formatting for Readability

-- ✅ Well-formatted join
SELECT c.cust_name, 
       o.order_num, 
       oi.prod_id
FROM Customers c
INNER JOIN Orders o ON c.cust_id = o.cust_id
INNER JOIN OrderItems oi ON o.order_num = oi.order_num
WHERE oi.prod_id = 'RGAN01';

Challenge Solutions

Challenge 1: Customer Orders (Both Syntaxes)

Traditional Syntax:

SELECT cust_name, order_num
FROM Customers, Orders
WHERE Customers.cust_id = Orders.cust_id
ORDER BY cust_name, order_num;

ANSI Syntax:

SELECT cust_name, order_num
FROM Customers
INNER JOIN Orders ON Customers.cust_id = Orders.cust_id
ORDER BY cust_name, order_num;

Challenge 2: Customer Orders with Totals

SELECT c.cust_name, 
       o.order_num,
       SUM(oi.item_price * oi.quantity) AS OrderTotal
FROM Customers c
INNER JOIN Orders o ON c.cust_id = o.cust_id
INNER JOIN OrderItems oi ON o.order_num = oi.order_num
GROUP BY c.cust_name, o.order_num
ORDER BY c.cust_name, o.order_num;

Challenge 3: BR01 Order Dates (Join Version)

SELECT o.order_date
FROM Orders o, OrderItems oi
WHERE o.order_num = oi.order_num
  AND oi.prod_id = 'BR01'
ORDER BY o.order_date;

Challenge 4: BR01 Customer Details (ANSI Syntax)

SELECT c.cust_email
FROM Customers c
INNER JOIN Orders o ON c.cust_id = o.cust_id
INNER JOIN OrderItems oi ON o.order_num = oi.order_num
WHERE oi.prod_id = 'BR01';

Challenge 5: High-Value Customers

SELECT c.cust_name,
       SUM(oi.item_price * oi.quantity) AS total_price
FROM Customers c
INNER JOIN Orders o ON c.cust_id = o.cust_id
INNER JOIN OrderItems oi ON o.order_num = oi.order_num
GROUP BY c.cust_name
HAVING SUM(oi.item_price * oi.quantity) >= 1000
ORDER BY c.cust_name;

Advanced Join Concepts

Table Aliases for Cleaner Code

SELECT c.cust_name, o.order_num, p.prod_name
FROM Customers c
INNER JOIN Orders o ON c.cust_id = o.cust_id
INNER JOIN OrderItems oi ON o.order_num = oi.order_num
INNER JOIN Products p ON oi.prod_id = p.prod_id;

Self-Joins (Preview)

-- Find employees and their managers
SELECT e.emp_name AS employee, m.emp_name AS manager
FROM Employees e
INNER JOIN Employees m ON e.manager_id = m.emp_id;

Advanced Joins

Table Aliases

Why Use Table Aliases?

Two primary reasons:

  1. Shorten SQL syntax for better readability
  2. Enable multiple uses of the same table in one query

Basic Table Alias Syntax

-- Without aliases (verbose)
SELECT Customers.cust_name, Orders.order_num
FROM Customers, Orders
WHERE Customers.cust_id = Orders.cust_id;
 
-- With aliases (concise)
SELECT c.cust_name, o.order_num
FROM Customers AS c, Orders AS o
WHERE c.cust_id = o.cust_id;

Complete Example with Multiple Tables

SELECT c.cust_name, c.cust_contact
FROM Customers AS c, Orders AS o, OrderItems AS oi
WHERE c.cust_id = o.cust_id
  AND oi.order_num = o.order_num
  AND prod_id = 'RGAN01';

Important Notes

  • Oracle: Don't use AS keyword (Customers c instead of Customers AS c)
  • Scope: Table aliases only exist during query execution
  • Usage: Can be used in SELECT, WHERE, ORDER BY, and other clauses

Self Joins

Problem Scenario

Find all customers who work for the same company as Jim Jones.

Subquery Approach

SELECT cust_id, cust_name, cust_contact
FROM Customers
WHERE cust_name = (SELECT cust_name
                   FROM Customers
                   WHERE cust_contact = 'Jim Jones');

Self Join Approach (More Efficient)

SELECT c1.cust_id, c1.cust_name, c1.cust_contact
FROM Customers AS c1, Customers AS c2
WHERE c1.cust_name = c2.cust_name
  AND c2.cust_contact = 'Jim Jones';

Output:

cust_id       cust_name      cust_contact
-------       -----------    --------------
1000000003    Fun4All        Jim Jones
1000000004    Fun4All        Denise L. Stephens

How Self Joins Work

  1. Same table used twice with different aliases (c1, c2)
  2. c2 finds Jim Jones's company
  3. c1 finds all customers from that company
  4. JOIN condition matches company names

Performance Advantage

Tip: Self joins are often processed faster than subqueries by most DBMSs.

Natural Joins

Purpose

Eliminate duplicate columns that appear in multiple joined tables.

Standard Join (with duplicates)

-- This would return cust_id from both tables
SELECT *
FROM Customers c, Orders o
WHERE c.cust_id = o.cust_id;

Natural Join (no duplicates)

SELECT c.*, o.order_num, o.order_date,
       oi.prod_id, oi.quantity, oi.item_price
FROM Customers AS c, Orders AS o, OrderItems AS oi
WHERE c.cust_id = o.cust_id
  AND oi.order_num = o.order_num
  AND prod_id = 'RGAN01';

Key Points:

  • Use * for one table only
  • Explicitly list columns from other tables
  • Prevents duplicate columns in results

Note: Most inner joins you create are actually natural joins.

Outer Joins

Purpose

Include rows that have no related rows in the other table.

Use Cases

  • Count orders per customer (including customers with no orders)
  • List products with quantities (including unordered products)
  • Calculate averages including customers who haven't purchased

Inner Join vs Outer Join

Inner Join (Excludes Unmatched Rows)

SELECT c.cust_id, o.order_num
FROM Customers c
INNER JOIN Orders o ON c.cust_id = o.cust_id;

Output:

cust_id       order_num
----------    ---------
1000000001    20005
1000000001    20009
1000000003    20006
1000000004    20007
1000000005    20008

Left Outer Join (Includes All Left Table Rows)

SELECT c.cust_id, o.order_num
FROM Customers c
LEFT OUTER JOIN Orders o ON c.cust_id = o.cust_id;

Output:

cust_id       order_num
----------    ---------
1000000001    20005
1000000001    20009
1000000002    NULL      -- Customer with no orders
1000000003    20006
1000000004    20007
1000000005    20008

Types of Outer Joins

1. LEFT OUTER JOIN

-- Include ALL rows from left table (Customers)
SELECT c.cust_id, o.order_num
FROM Customers c
LEFT OUTER JOIN Orders o ON c.cust_id = o.cust_id;

2. RIGHT OUTER JOIN

-- Include ALL rows from right table (Orders)
SELECT c.cust_id, o.order_num
FROM Customers c
RIGHT OUTER JOIN Orders o ON c.cust_id = o.cust_id;

3. FULL OUTER JOIN

-- Include ALL rows from BOTH tables
SELECT c.cust_id, o.order_num
FROM Customers c
FULL OUTER JOIN Orders o ON c.cust_id = o.cust_id;

DBMS Support

Join TypeMySQL/MariaDBPostgreSQLSQL ServerOracleSQLite
LEFT OUTER
RIGHT OUTER
FULL OUTER

Converting Between Left and Right Joins

-- These are equivalent:
-- LEFT OUTER JOIN
SELECT c.cust_id, o.order_num
FROM Customers c
LEFT OUTER JOIN Orders o ON c.cust_id = o.cust_id;
 
-- RIGHT OUTER JOIN (tables reversed)
SELECT c.cust_id, o.order_num
FROM Orders o
RIGHT OUTER JOIN Customers c ON c.cust_id = o.cust_id;

Joins with Aggregate Functions

Count Orders per Customer (Inner Join)

SELECT c.cust_id,
       COUNT(o.order_num) AS num_ord
FROM Customers c
INNER JOIN Orders o ON c.cust_id = o.cust_id
GROUP BY c.cust_id;

Output:

cust_id       num_ord
----------    --------
1000000001    2
1000000003    1
1000000004    1
1000000005    1

Count Orders per Customer (Including Zero Orders)

SELECT c.cust_id,
       COUNT(o.order_num) AS num_ord
FROM Customers c
LEFT OUTER JOIN Orders o ON c.cust_id = o.cust_id
GROUP BY c.cust_id;

Output:

cust_id       num_ord
----------    -------
1000000001    2
1000000002    0        -- Customer with no orders included
1000000003    1
1000000004    1
1000000005    1

Advanced Aggregate Examples

Total Sales per Customer

SELECT c.cust_name,
       COALESCE(SUM(oi.item_price * oi.quantity), 0) AS total_sales
FROM Customers c
LEFT OUTER JOIN Orders o ON c.cust_id = o.cust_id
LEFT OUTER JOIN OrderItems oi ON o.order_num = oi.order_num
GROUP BY c.cust_id, c.cust_name
ORDER BY total_sales DESC;

Product Performance Analysis

SELECT p.prod_name,
       COUNT(oi.order_num) AS times_ordered,
       COALESCE(SUM(oi.quantity), 0) AS total_quantity
FROM Products p
LEFT OUTER JOIN OrderItems oi ON p.prod_id = oi.prod_id
GROUP BY p.prod_id, p.prod_name
ORDER BY times_ordered DESC;

Challenge Solutions

Challenge 1: Customer Orders (Inner Join)

SELECT c.cust_name, o.order_num
FROM Customers c
INNER JOIN Orders o ON c.cust_id = o.cust_id
ORDER BY c.cust_name;

Challenge 2: All Customers (Including No Orders)

SELECT c.cust_name, o.order_num
FROM Customers c
LEFT OUTER JOIN Orders o ON c.cust_id = o.cust_id
ORDER BY c.cust_name;

Challenge 3: Products and Order Numbers

SELECT p.prod_name, oi.order_num
FROM Products p
LEFT OUTER JOIN OrderItems oi ON p.prod_id = oi.prod_id
ORDER BY p.prod_name;

Challenge 4: Product Order Counts

SELECT p.prod_name,
       COUNT(oi.order_num) AS order_count
FROM Products p
LEFT OUTER JOIN OrderItems oi ON p.prod_id = oi.prod_id
GROUP BY p.prod_id, p.prod_name
ORDER BY order_count DESC;

Challenge 5: Vendor Product Counts

SELECT v.vend_id,
       COUNT(p.prod_id) AS product_count
FROM Vendors v
LEFT OUTER JOIN Products p ON v.vend_id = p.vend_id
GROUP BY v.vend_id
ORDER BY product_count DESC;

Best Practices and Key Points

1. Choose the Right Join Type

  • INNER JOIN: When you only want matching rows
  • LEFT/RIGHT OUTER JOIN: When you need all rows from one table
  • FULL OUTER JOIN: When you need all rows from both tables

2. Always Provide Join Conditions

-- ❌ Wrong: Creates Cartesian product
SELECT * FROM Customers, Orders;
 
-- ✅ Correct: Proper join condition
SELECT * FROM Customers c
INNER JOIN Orders o ON c.cust_id = o.cust_id;

3. Test Joins Separately

-- Test each join individually before combining
-- 1. First join
SELECT * FROM Customers c
INNER JOIN Orders o ON c.cust_id = o.cust_id;
 
-- 2. Add second join
SELECT * FROM Customers c
INNER JOIN Orders o ON c.cust_id = o.cust_id
INNER JOIN OrderItems oi ON o.order_num = oi.order_num;

4. Use Meaningful Aliases

-- ✅ Good: Clear, meaningful aliases
SELECT cust.name, ord.date, item.price
FROM Customers cust
INNER JOIN Orders ord ON cust.id = ord.customer_id
INNER JOIN OrderItems item ON ord.num = item.order_num;

5. Handle NULLs in Outer Joins

-- Use COALESCE for better NULL handling
SELECT c.cust_name,
       COALESCE(COUNT(o.order_num), 0) AS order_count
FROM Customers c
LEFT OUTER JOIN Orders o ON c.cust_id = o.cust_id
GROUP BY c.cust_id, c.cust_name;

Combined Queries (UNION)

Understanding Combined Queries

Combined queries (also called unions or compound queries) allow you to execute multiple SELECT statements and return results as a single result set.

When to Use Combined Queries

Two primary scenarios:

  1. Different tables with similar structure

  2. Combine customer data from multiple regional databases

  3. Merge current and archived order data

  4. Multiple queries against same table

  5. Complex filtering that's easier with separate queries

  6. Alternative to complex WHERE clauses

UNION vs Multiple WHERE Clauses

Most combined queries on the same table can be written as single queries with multiple WHERE conditions, but UNION offers more flexibility and sometimes better readability.

Creating Combined Queries with UNION

Basic UNION Syntax

SELECT column1, column2, column3
FROM table1
WHERE condition1
UNION
SELECT column1, column2, column3
FROM table2
WHERE condition2;

Practical Example: Customer Report

Requirement: Get all customers in IL, IN, MI, plus all Fun4All locations regardless of state.

Individual Queries

Query 1: Customers in specific states

SELECT cust_name, cust_contact, cust_email
FROM Customers
WHERE cust_state IN ('IL','IN','MI');

Output:

cust_name       cust_contact    cust_email
-----------     -------------   ------------
Village Toys    John Smith      sales@villagetoys.com
Fun4All         Jim Jones       jjones@fun4all.com
The Toy Store   Kim Howard      NULL

Query 2: All Fun4All locations

SELECT cust_name, cust_contact, cust_email
FROM Customers
WHERE cust_name = 'Fun4All';

Output:

cust_name       cust_contact          cust_email
------------    -----------           -------------
Fun4All         Jim Jones             jjones@fun4all.com
Fun4All         Denise L. Stephens    dstephens@fun4all.com

Combined with UNION

SELECT cust_name, cust_contact, cust_email
FROM Customers
WHERE cust_state IN ('IL','IN','MI')
UNION
SELECT cust_name, cust_contact, cust_email
FROM Customers
WHERE cust_name = 'Fun4All';

Output:

cust_name      cust_contact          cust_email
-----------    -----------           ----------------
Fun4All        Denise L. Stephens    dstephens@fun4all.com
Fun4All        Jim Jones             jjones@fun4all.com
Village Toys   John Smith            sales@villagetoys.com
The Toy Store  Kim Howard            NULL

Notice: Only 4 rows returned (not 5) because UNION automatically removes duplicates.

Equivalent WHERE Clause Approach

SELECT cust_name, cust_contact, cust_email
FROM Customers
WHERE cust_state IN ('IL','IN','MI')
   OR cust_name = 'Fun4All';

UNION Rules and Requirements

Essential Rules

  1. Two or more SELECT statements separated by UNION
  2. Same number of columns in each SELECT
  3. Compatible data types (must be convertible)
  4. Column order matters (first query determines column names)

Column Compatibility

-- ✅ Valid: Same columns, compatible types
SELECT prod_id, prod_name, prod_price
FROM Products
WHERE prod_price < 5
UNION
SELECT prod_id, prod_name, prod_price
FROM Products
WHERE vend_id = 'DLL01';
 
-- ❌ Invalid: Different number of columns
SELECT prod_id, prod_name
FROM Products
UNION
SELECT prod_id, prod_name, prod_price  -- Extra column
FROM Products;

Column Names in Results

-- First query determines column names
SELECT prod_name, prod_price    -- These names used in result
FROM Products
WHERE prod_price < 5
UNION
SELECT productname, price       -- These names ignored
FROM Products
WHERE vend_id = 'DLL01';
-- Result columns: prod_name, prod_price

Using Aliases for Column Names

SELECT prod_name AS product, prod_price AS cost
FROM Products
WHERE prod_price < 5
UNION
SELECT prod_name, prod_price
FROM Products
WHERE vend_id = 'DLL01';
-- Result columns: product, cost

UNION vs UNION ALL

UNION (Default - Removes Duplicates)

SELECT cust_name, cust_contact, cust_email
FROM Customers
WHERE cust_state IN ('IL','IN','MI')
UNION
SELECT cust_name, cust_contact, cust_email
FROM Customers
WHERE cust_name = 'Fun4All';
-- Returns 4 rows (duplicate removed)

UNION ALL (Keeps All Rows)

SELECT cust_name, cust_contact, cust_email
FROM Customers
WHERE cust_state IN ('IL','IN','MI')
UNION ALL
SELECT cust_name, cust_contact, cust_email
FROM Customers
WHERE cust_name = 'Fun4All';

Output:

cust_name      cust_contact           cust_email
-----------    -------------          ------------
Village Toys   John Smith             sales@villagetoys.com
Fun4All        Jim Jones              jjones@fun4all.com
The Toy Store  Kim Howard             NULL
Fun4All        Jim Jones              jjones@fun4all.com    -- Duplicate kept
Fun4All        Denise L. Stephens     dstephens@fun4all.com

When to Use Each

Use UNION whenUse UNION ALL when
You want unique resultsYou need all occurrences
Duplicates should be eliminatedDuplicates are meaningful
Performance isn't criticalPerformance is critical

Performance Tip: UNION ALL is faster because it doesn't need to check for duplicates.

Sorting Combined Results

Single ORDER BY Rule

Only one ORDER BY clause allowed, and it must come after the final SELECT.

SELECT cust_name, cust_contact, cust_email
FROM Customers
WHERE cust_state IN ('IL','IN','MI')
UNION
SELECT cust_name, cust_contact, cust_email
FROM Customers
WHERE cust_name = 'Fun4All'
ORDER BY cust_name, cust_contact;  -- Sorts entire result set

Output:

cust_name       cust_contact          cust_email
-----------     -------------         -------------
Fun4All         Denise L. Stephens    dstephens@fun4all.com
Fun4All         Jim Jones             jjones@fun4all.com
The Toy Store   Kim Howard            NULL
Village Toys    John Smith            sales@villagetoys.com

Column References in ORDER BY

-- ✅ Valid: Use column names from first SELECT
ORDER BY cust_name
 
-- ✅ Valid: Use column positions
ORDER BY 1, 2
 
-- ❌ Invalid: Use column names not in first SELECT
ORDER BY customer_name  -- If first SELECT uses cust_name

Advanced UNION Techniques

Multiple Table UNION

-- Combine data from different tables
SELECT 'Customer' AS type, cust_name AS name, cust_city AS city
FROM Customers
UNION
SELECT 'Vendor' AS type, vend_name AS name, vend_city AS city
FROM Vendors
ORDER BY type, name;

Complex Filtering with UNION

-- High-value orders OR recent orders
SELECT order_num, order_date, 'High Value' AS category
FROM Orders o
WHERE (SELECT SUM(item_price * quantity) 
       FROM OrderItems oi 
       WHERE oi.order_num = o.order_num) > 100
UNION
SELECT order_num, order_date, 'Recent' AS category
FROM Orders
WHERE order_date >= DATEADD(day, -30, GETDATE())
ORDER BY order_date DESC;

Data Type Conversion

-- Mixing compatible types
SELECT prod_id, CAST(prod_price AS VARCHAR(10)) AS value
FROM Products
UNION
SELECT CAST(cust_id AS VARCHAR(10)), cust_name
FROM Customers;

Challenge Solutions

Challenge 1: Product Quantities

SELECT prod_id, quantity
FROM OrderItems
WHERE quantity = 100
UNION
SELECT prod_id, quantity
FROM OrderItems
WHERE prod_id LIKE 'BNBG%'
ORDER BY prod_id;

Challenge 2: Single SELECT Alternative

SELECT prod_id, quantity
FROM OrderItems
WHERE quantity = 100
   OR prod_id LIKE 'BNBG%'
ORDER BY prod_id;

Challenge 3: Products and Customers

SELECT prod_name AS name
FROM Products
UNION
SELECT cust_name AS name
FROM Customers
ORDER BY name;

Challenge 4: What's Wrong?

-- ❌ INCORRECT: ORDER BY in wrong position
SELECT cust_name, cust_contact, cust_email
FROM Customers
WHERE cust_state = 'MI'
ORDER BY cust_name;        -- ❌ ORDER BY before UNION
UNION
SELECT cust_name, cust_contact, cust_email
FROM Customers
WHERE cust_state = 'IL'
ORDER BY cust_name;        -- ❌ Multiple ORDER BY clauses
 
-- ✅ CORRECT:
SELECT cust_name, cust_contact, cust_email
FROM Customers
WHERE cust_state = 'MI'
UNION
SELECT cust_name, cust_contact, cust_email
FROM Customers
WHERE cust_state = 'IL'
ORDER BY cust_name;        -- ✅ Single ORDER BY at end

Other UNION Types (Advanced)

EXCEPT/MINUS

-- Rows in first query but NOT in second
SELECT prod_id FROM Products
EXCEPT
SELECT prod_id FROM OrderItems;
-- Products never ordered

INTERSECT

-- Rows that exist in BOTH queries
SELECT prod_id FROM Products
INTERSECT
SELECT prod_id FROM OrderItems;
-- Products that have been ordered

Note: EXCEPT and INTERSECT are less commonly supported and used than UNION.

Performance Considerations

Query Optimizer Impact

  • Most DBMSs optimize UNION queries internally
  • Performance difference between UNION and complex WHERE is often minimal
  • Always test both approaches with your specific data

Best Practices

  1. Use UNION ALL when duplicates don't matter (faster)
  2. Index join columns used in WHERE clauses
  3. Test performance with realistic data volumes
  4. Consider alternatives like JOINs for complex scenarios

Data Insertion (INSERT)

Understanding Data Insertion

The INSERT statement is used to add new rows to database tables. It's one of the three most frequently used SQL statements alongside SELECT and UPDATE/DELETE.

Three main INSERT methods:

  1. Inserting a single complete row
  2. Inserting a single partial row
  3. Inserting results of a query

Security Note: INSERT operations may require special privileges in client/server DBMSs.

Inserting Complete Rows

Basic INSERT Syntax (Not Recommended)

-- ❌ Unsafe: Depends on column order
INSERT INTO Customers
VALUES(1000000006,
       'Toy Land',
       '123 Any Street',
       'New York',
       'NY',
       '11111',
       'USA',
       NULL,
       NULL);

Problems with basic syntax:

  • Depends on column order in table definition
  • Breaks if table structure changes
  • No guarantee of column order consistency

Safe INSERT Syntax (Recommended)

-- ✅ Safe: Explicitly specifies columns
INSERT INTO Customers(cust_id,
                      cust_name,
                      cust_address,
                      cust_city,
                      cust_state,
                      cust_zip,
                      cust_country,
                      cust_contact,
                      cust_email)
VALUES(1000000006,
       'Toy Land',
       '123 Any Street',
       'New York',
       'NY',
       '11111',
       'USA',
       NULL,
       NULL);

Advantages of explicit column specification:

  • Order independent: Columns can be in any order
  • Future-proof: Works even if table structure changes
  • Self-documenting: Clear what data goes where
  • Error prevention: Reduces mistakes

Flexible Column Ordering

-- Columns can be specified in any order
INSERT INTO Customers(cust_id,
                      cust_contact,      -- Different order
                      cust_email,        -- Different order
                      cust_name,
                      cust_address,
                      cust_city,
                      cust_state,
                      cust_zip,
                      cust_country)
VALUES(1000000007,
       NULL,                -- Matches cust_contact
       NULL,                -- Matches cust_email
       'Game Store',
       '456 Game Ave',
       'Chicago',
       'IL',
       '60601',
       'USA');

Primary Key Constraints

-- ❌ This will fail if cust_id already exists
INSERT INTO Customers(cust_id, cust_name, ...)
VALUES(1000000006, 'Duplicate Store', ...);
-- Error: Primary key violation

Important: Primary key values must be unique. Attempting to insert duplicate primary keys will result in an error.

Inserting Partial Rows

You can omit columns when certain conditions are met:

Valid Partial Row Insert

INSERT INTO Customers(cust_id,
                      cust_name,
                      cust_address,
                      cust_city,
                      cust_state,
                      cust_zip,
                      cust_country)
VALUES(1000000008,
       'Quick Shop',
       '789 Fast Lane',
       'Miami',
       'FL',
       '33101',
       'USA');
-- cust_contact and cust_email omitted

When You Can Omit Columns

ConditionDescription
NULL allowedColumn definition allows NULL values
Default valueColumn has a default value specified

When You Cannot Omit Columns

-- ❌ This will fail if cust_name is required
INSERT INTO Customers(cust_id,
                      cust_address,
                      cust_city)
VALUES(1000000009,
       '321 Main St',
       'Boston');
-- Error: cust_name cannot be NULL

Inserting Retrieved Data (INSERT SELECT)

INSERT SELECT allows you to insert data from another table or query result.

Basic INSERT SELECT

INSERT INTO Customers(cust_id,
                      cust_contact,
                      cust_email,
                      cust_name,
                      cust_address,
                      cust_city,
                      cust_state,
                      cust_zip,
                      cust_country)
SELECT cust_id,
       cust_contact,
       cust_email,
       cust_name,
       cust_address,
       cust_city,
       cust_state,
       cust_zip,
       cust_country
FROM CustNew;

Key Points:

  • Column positions matter, not names
  • Number of columns must match
  • Data types must be compatible
  • Inserts multiple rows in single operation

INSERT SELECT with Filtering

-- Insert only active customers from another table
INSERT INTO Customers(cust_id, cust_name, cust_city, cust_state)
SELECT cust_id, cust_name, cust_city, cust_state
FROM CustImport
WHERE status = 'ACTIVE'
  AND cust_city IS NOT NULL;

INSERT SELECT with Transformations

-- Insert with data transformations
INSERT INTO Customers(cust_id, cust_name, cust_email, cust_country)
SELECT customer_number,
       UPPER(customer_name),           -- Transform to uppercase
       LOWER(email_address),           -- Transform to lowercase
       'USA'                           -- Constant value
FROM ImportTable
WHERE import_date >= '2024-01-01';

Copying Tables (CREATE SELECT)

CREATE SELECT creates a new table and copies data in one operation.

Standard Syntax

-- Most DBMSs
CREATE TABLE CustCopy AS 
SELECT * FROM Customers;

SQL Server Syntax

-- SQL Server specific
SELECT * INTO CustCopy 
FROM Customers;

Selective Column Copying

-- Copy only specific columns
CREATE TABLE CustSummary AS
SELECT cust_id, cust_name, cust_city, cust_state
FROM Customers
WHERE cust_country = 'USA';

Advanced CREATE SELECT Examples

-- Copy with joins and calculations
CREATE TABLE CustomerOrders AS
SELECT c.cust_id,
       c.cust_name,
       COUNT(o.order_num) AS total_orders,
       COALESCE(SUM(oi.item_price * oi.quantity), 0) AS total_spent
FROM Customers c
LEFT JOIN Orders o ON c.cust_id = o.cust_id
LEFT JOIN OrderItems oi ON o.order_num = oi.order_num
GROUP BY c.cust_id, c.cust_name;

CREATE SELECT Features

FeatureSupported
WHERE clauses✅ Yes
GROUP BY✅ Yes
ORDER BY✅ Yes
JOINs✅ Yes
Multiple target tables❌ No

DBMS Support: DB2 does not support CREATE SELECT syntax.

Best Practices

1. Always Specify Columns

-- ✅ Good: Explicit and safe
INSERT INTO Customers(cust_id, cust_name, cust_city)
VALUES(1000000010, 'Best Store', 'Seattle');
 
-- ❌ Bad: Depends on table structure
INSERT INTO Customers
VALUES(1000000010, 'Best Store', NULL, 'Seattle', ...);

2. Handle NULL Values Properly

-- Explicit NULL handling
INSERT INTO Customers(cust_id, cust_name, cust_contact, cust_email)
VALUES(1000000011, 'New Store', NULL, NULL);
 
-- Or omit NULL columns entirely
INSERT INTO Customers(cust_id, cust_name)
VALUES(1000000011, 'New Store');

3. Use Transactions for Multiple Inserts

BEGIN TRANSACTION;
 
INSERT INTO Customers(cust_id, cust_name) VALUES(1000000012, 'Store A');
INSERT INTO Customers(cust_id, cust_name) VALUES(1000000013, 'Store B');
INSERT INTO Customers(cust_id, cust_name) VALUES(1000000014, 'Store C');
 
COMMIT;

4. Validate Data Before Insertion

-- Check for duplicates before inserting
INSERT INTO Customers(cust_id, cust_name, cust_email)
SELECT 1000000015, 'Unique Store', 'unique@store.com'
WHERE NOT EXISTS (
    SELECT 1 FROM Customers 
    WHERE cust_email = 'unique@store.com'
);

Performance Considerations

Single Row vs Bulk Operations

| Method | Performance | Use Case |-----|----- | Single INSERT | Slower | One-off additions | INSERT SELECT | Faster | Bulk data migration | Multiple INSERTs | Medium | Batch processing

Optimizing INSERT Performance

-- Disable indexes temporarily for large bulk inserts
-- (Database-specific syntax)
 
-- Use INSERT SELECT for bulk operations
INSERT INTO TargetTable(col1, col2, col3)
SELECT col1, col2, col3
FROM SourceTable
WHERE condition;
 
-- Batch multiple single inserts in transactions

Challenge Solutions

Challenge 1: Add Yourself to Customers

INSERT INTO Customers(cust_id,
                      cust_name,
                      cust_address,
                      cust_city,
                      cust_state,
                      cust_zip,
                      cust_country,
                      cust_contact,
                      cust_email)
VALUES(1000000020,
       'John Doe',
       '123 Main Street',
       'Anytown',
       'CA',
       '90210',
       'USA',
       'John Doe',
       'john.doe@email.com');

Challenge 2: Backup Tables

-- Backup Orders table
CREATE TABLE OrdersBackup AS
SELECT * FROM Orders;
 
-- Backup OrderItems table  
CREATE TABLE OrderItemsBackup AS
SELECT * FROM OrderItems;
 
-- Or using SQL Server syntax:
SELECT * INTO OrdersBackup FROM Orders;
SELECT * INTO OrderItemsBackup FROM OrderItems;

Common Errors and Solutions

1. Primary Key Violations

-- Error: Duplicate primary key
-- Solution: Check existing values first
SELECT MAX(cust_id) FROM Customers;  -- Find next available ID

2. NULL Constraint Violations

-- Error: Column doesn't allow NULL
-- Solution: Provide value or check table definition
DESCRIBE Customers;  -- Check which columns allow NULL

3. Data Type Mismatches

-- Error: Invalid data type
-- Solution: Cast or convert data types
INSERT INTO Customers(cust_id, cust_name)
VALUES(CAST('1000000021' AS INTEGER), 'Type Safe Store');

Data Updating and Deleting (UPDATE & DELETE)

Understanding Data Modification

The UPDATE and DELETE statements are essential for maintaining and modifying database content. Both are powerful but potentially dangerous operations that require careful handling.

Critical Warning: Both UPDATE and DELETE can affect every row in a table if used without proper WHERE clauses.

Updating Data (UPDATE Statement)

UPDATE modifies existing data in table rows and can be used in two ways:

  1. Update specific rows (with WHERE clause)
  2. Update all rows (without WHERE clause - dangerous!)

Basic UPDATE Syntax

UPDATE table_name
SET column1 = value1,
    column2 = value2
WHERE condition;

Three essential components:

  1. Table to update
  2. Column names and new values
  3. Filter condition (WHERE clause)

Single Column Update

-- Update email for specific customer
UPDATE Customers
SET cust_email = 'kim@thetoystore.com'
WHERE cust_id = 1000000005;

How it works:

  • UPDATE Customers: Specifies target table
  • SET cust_email = 'value': Assigns new value to column
  • WHERE cust_id = 1000000005: Filters to specific row

Multiple Column Update

-- Update contact and email for specific customer
UPDATE Customers
SET cust_contact = 'Sam Roberts',
    cust_email = 'sam@toyland.com'
WHERE cust_id = 1000000006;

Key points:

  • Single SET command for multiple columns
  • Comma-separated column = value pairs
  • No comma after last column

Setting Values to NULL

-- Remove email address (set to NULL)
UPDATE Customers
SET cust_email = NULL
WHERE cust_id = 1000000005;

NULL vs Empty String:

  • NULL: No value at all
  • Empty string (''): Actual value that happens to be empty

Advanced UPDATE Techniques

Using Subqueries in UPDATE

-- Update customer's state based on zip code lookup
UPDATE Customers
SET cust_state = (
    SELECT state_code 
    FROM ZipCodes 
    WHERE ZipCodes.zip = Customers.cust_zip
)
WHERE cust_country = 'USA';

Conditional Updates

-- Update prices with different logic
UPDATE Products
SET prod_price = CASE 
    WHEN prod_price < 10 THEN prod_price * 1.1  -- 10% increase
    WHEN prod_price < 50 THEN prod_price * 1.05 -- 5% increase
    ELSE prod_price * 1.02                      -- 2% increase
END
WHERE vend_id = 'DLL01';

UPDATE with Calculations

-- Apply discount based on order quantity
UPDATE OrderItems
SET item_price = item_price * 0.9  -- 10% discount
WHERE quantity >= 10;

UPDATE with JOINs (Some DBMSs)

-- Update customer info from another table (SQL Server syntax)
UPDATE Customers
SET cust_email = ci.email_address
FROM Customers c
INNER JOIN CustomerImport ci ON c.cust_id = ci.customer_id
WHERE ci.email_address IS NOT NULL;

Deleting Data (DELETE Statement)

DELETE removes entire rows from tables and can be used in two ways:

  1. Delete specific rows (with WHERE clause)
  2. Delete all rows (without WHERE clause - very dangerous!)

Basic DELETE Syntax

DELETE FROM table_name
WHERE condition;

Single Row Deletion

-- Delete specific customer
DELETE FROM Customers
WHERE cust_id = 1000000006;

Key characteristics:

  • DELETE FROM: Specifies table
  • No column names: Deletes entire rows
  • WHERE clause: Critical for targeting specific rows

Multiple Row Deletion

-- Delete all customers from specific state
DELETE FROM Customers
WHERE cust_state = 'XX';
 
-- Delete old orders
DELETE FROM Orders
WHERE order_date < '2020-01-01';

DELETE vs TRUNCATE

OperationDELETETRUNCATE
SpeedSlowerMuch faster
WHERE clauseSupportedNot supported
LoggingFull loggingMinimal logging
RollbackCan rollbackLimited rollback
TriggersFires triggersMay not fire triggers
-- Delete all rows (slow, logged)
DELETE FROM TempTable;
 
-- Delete all rows (fast, minimal logging)
TRUNCATE TABLE TempTable;

Safety Guidelines and Best Practices

1. Always Use WHERE Clauses

-- ✅ Safe: Targets specific rows
UPDATE Customers
SET cust_email = 'new@email.com'
WHERE cust_id = 1000000005;
 
-- ❌ Dangerous: Updates ALL rows
UPDATE Customers
SET cust_email = 'new@email.com';

2. Test with SELECT First

-- Step 1: Test your WHERE clause
SELECT cust_id, cust_name, cust_email
FROM Customers
WHERE cust_state = 'CA';
 
-- Step 2: If results look correct, then update
UPDATE Customers
SET cust_email = LOWER(cust_email)
WHERE cust_state = 'CA';

3. Use Primary Keys When Possible

-- ✅ Best: Use primary key for precision
DELETE FROM Customers
WHERE cust_id = 1000000006;
 
-- ✅ Good: Use unique constraints
UPDATE Customers
SET cust_contact = 'New Contact'
WHERE cust_email = 'unique@email.com';

4. Backup Before Major Operations

-- Create backup before bulk operations
CREATE TABLE CustomersBackup AS
SELECT * FROM Customers;
 
-- Perform bulk update
UPDATE Customers
SET cust_state = UPPER(cust_state)
WHERE cust_country = 'USA';

5. Use Transactions for Safety

-- Start transaction
BEGIN TRANSACTION;
 
-- Perform updates
UPDATE Customers
SET cust_email = LOWER(cust_email)
WHERE cust_country = 'USA';
 
-- Check results
SELECT COUNT(*) FROM Customers WHERE cust_email LIKE '%@%';
 
-- If satisfied, commit; otherwise rollback
COMMIT;
-- OR
-- ROLLBACK;

Foreign Key Protection

Referential Integrity Benefits:

-- This will fail if customer has orders
DELETE FROM Customers
WHERE cust_id = 1000000001;
-- Error: Cannot delete referenced row
 
-- This protects against orphaned records
DELETE FROM Vendors
WHERE vend_id = 'DLL01';
-- Error: Products reference this vendor

Foreign keys prevent:

  • Orphaned records in child tables
  • Accidental deletion of referenced data
  • Data integrity violations

Common Patterns and Examples

Bulk Data Cleanup

-- Remove inactive customers (no orders in 2 years)
DELETE FROM Customers
WHERE cust_id NOT IN (
    SELECT DISTINCT cust_id
    FROM Orders
    WHERE order_date >= DATEADD(year, -2, GETDATE())
);

Data Standardization

-- Standardize phone number format
UPDATE Customers
SET cust_phone = REPLACE(REPLACE(REPLACE(cust_phone, '(', ''), ')', ''), '-', '')
WHERE cust_phone IS NOT NULL;
 
-- Standardize state codes to uppercase
UPDATE Customers
SET cust_state = UPPER(cust_state)
WHERE cust_country = 'USA';

Conditional Deletion

-- Delete duplicate records (keep newest)
DELETE c1 FROM Customers c1
INNER JOIN Customers c2
WHERE c1.cust_email = c2.cust_email
  AND c1.cust_id < c2.cust_id;

Error Handling and Recovery

Common Errors:

  1. Constraint Violations
-- Error: Foreign key constraint
DELETE FROM Vendors WHERE vend_id = 'DLL01';
-- Solution: Delete dependent records first
  1. Missing WHERE Clause
-- Accidentally updates all rows
UPDATE Products SET prod_price = 9.99;
-- Solution: Use transactions and test first
  1. Data Type Mismatches
-- Error: Invalid data type
UPDATE Customers SET cust_id = 'ABC123';
-- Solution: Ensure compatible data types

Recovery Strategies:

-- 1. Restore from backup
RESTORE TABLE Customers FROM BACKUP;
 
-- 2. Rollback transaction
ROLLBACK;
 
-- 3. Reverse operation (if possible)
UPDATE Customers
SET cust_email = old_email_backup
WHERE cust_id IN (affected_ids);

Challenge Solutions

Challenge 1: Uppercase State Abbreviations

-- Update vendor states to uppercase
UPDATE Vendors
SET vend_state = UPPER(vend_state)
WHERE vend_country = 'USA';
 
-- Update customer states to uppercase
UPDATE Customers
SET cust_state = UPPER(cust_state)
WHERE cust_country = 'USA';

Challenge 2: Delete Yourself from Customers

-- Step 1: Test the WHERE clause first
SELECT cust_id, cust_name, cust_email
FROM Customers
WHERE cust_name = 'Your Name'  -- Replace with your actual name
  AND cust_email = 'your.email@domain.com';  -- Replace with your email
 
-- Step 2: If results look correct, then delete
DELETE FROM Customers
WHERE cust_name = 'Your Name'
  AND cust_email = 'your.email@domain.com';

Advanced Safety Techniques

1. Row Count Validation

-- Check how many rows will be affected
SELECT COUNT(*) AS rows_to_update
FROM Customers
WHERE cust_state = 'ca';
 
-- If count looks reasonable, proceed
UPDATE Customers
SET cust_state = UPPER(cust_state)
WHERE cust_state = 'ca';

2. Staged Updates

-- Update in small batches
UPDATE Customers
SET cust_email = LOWER(cust_email)
WHERE cust_id BETWEEN 1000000001 AND 1000000010;
 
-- Check results before continuing
SELECT * FROM Customers WHERE cust_id BETWEEN 1000000001 AND 1000000010;

3. Audit Trail

-- Log changes before making them
INSERT INTO ChangeLog (table_name, operation, old_values, new_values, change_date)
SELECT 'Customers', 'UPDATE', 
       CONCAT('email:', cust_email), 
       CONCAT('email:', LOWER(cust_email)),
       GETDATE()
FROM Customers
WHERE cust_email != LOWER(cust_email);
 
-- Then perform the update
UPDATE Customers
SET cust_email = LOWER(cust_email)
WHERE cust_email != LOWER(cust_email);

Table Creation and Manipulation

Understanding Table Operations

SQL provides powerful statements for table structure management:

  • CREATE TABLE: Create new tables
  • ALTER TABLE: Modify existing table structure
  • DROP TABLE: Delete entire tables
  • RENAME TABLE: Change table names (DBMS-specific)

Important: These operations affect table structure, not just data, and should be used with extreme caution.

Creating Tables (CREATE TABLE)

Two approaches to table creation:

  1. Interactive tools: GUI-based database management tools
  2. SQL statements: Direct programmatic control

Basic CREATE TABLE Syntax

CREATE TABLE table_name
(
    column1_name    datatype    constraints,
    column2_name    datatype    constraints,
    column3_name    datatype    constraints
);

Essential Information Required:

  1. Table name (must be unique in database)
  2. Column names and definitions
  3. Data types for each column
  4. Constraints (NULL/NOT NULL, defaults, etc.)

Complete Table Creation Example

CREATE TABLE Products
(
    prod_id      CHAR(10)        NOT NULL,
    vend_id      CHAR(10)        NOT NULL,
    prod_name    CHAR(254)       NOT NULL,
    prod_price   DECIMAL(8,2)    NOT NULL,
    prod_desc    VARCHAR(1000)   NULL
);

Column Definition Components:

  • prod_id: Column name
  • CHAR(10): Data type with length
  • NOT NULL: Constraint (required field)

Data Types Overview

CategoryExamplesDescription
CharacterCHAR(n), VARCHAR(n)Fixed/variable length text
NumericINTEGER, DECIMAL(p,s)Whole numbers, decimals
Date/TimeDATE, DATETIME, TIMESTAMPDate and time values
BooleanBOOLEAN, BITTrue/false values

Working with NULL Values

NULL vs NOT NULL Constraints

CREATE TABLE Orders
(
    order_num     INTEGER     NOT NULL,    -- Required field
    order_date    DATETIME    NOT NULL,    -- Required field
    cust_id       CHAR(10)    NOT NULL     -- Required field
);

Mixed NULL/NOT NULL Example

CREATE TABLE Vendors
(
    vend_id        CHAR(10)    NOT NULL,   -- Required
    vend_name      CHAR(50)    NOT NULL,   -- Required
    vend_address   CHAR(50)    ,           -- Optional (NULL allowed)
    vend_city      CHAR(50)    ,           -- Optional
    vend_state     CHAR(5)     ,           -- Optional
    vend_zip       CHAR(10)    ,           -- Optional
    vend_country   CHAR(50)                -- Optional
);

NULL Value Rules:

  • NOT NULL: Column must have a value
  • NULL (default): Column can be empty
  • Primary keys: Must always be NOT NULL

Important Distinctions:

Value TypeDescriptionExample
NULLNo value at allNULL
Empty stringValid empty text''
ZeroNumeric value0
-- These are all different:
INSERT INTO Products (prod_id, prod_name, prod_price)
VALUES ('P001', '', 0);        -- Empty string, zero price
-- vs
VALUES ('P002', NULL, NULL);   -- NULL name and price

Specifying Default Values

DEFAULT keyword provides automatic values when none specified.

Basic Default Values

CREATE TABLE OrderItems
(
    order_num     INTEGER         NOT NULL,
    order_item    INTEGER         NOT NULL,
    prod_id       CHAR(10)        NOT NULL,
    quantity      INTEGER         NOT NULL    DEFAULT 1,      -- Default to 1
    item_price    DECIMAL(8,2)    NOT NULL
);

System Date Defaults

Different DBMSs use different functions for current date/time:

DBMSFunctionExample
DB2CURRENT_DATEDEFAULT CURRENT_DATE
MySQLCURRENT_DATE(), NOW()DEFAULT CURRENT_DATE()
OracleSYSDATEDEFAULT SYSDATE
PostgreSQLCURRENT_DATEDEFAULT CURRENT_DATE
SQL ServerGETDATE()DEFAULT GETDATE()
SQLitedate('now')DEFAULT date('now')

Date Default Example

CREATE TABLE Orders
(
    order_num     INTEGER     NOT NULL,
    order_date    DATETIME    NOT NULL    DEFAULT GETDATE(),  -- SQL Server
    cust_id       CHAR(10)    NOT NULL,
    status        CHAR(10)    DEFAULT 'PENDING'               -- Text default
);

Updating Table Structure (ALTER TABLE)

ALTER TABLE modifies existing table structure, but capabilities vary significantly between DBMSs.

Common ALTER TABLE Limitations:

  • Adding columns: Usually supported
  • Removing columns: Often restricted
  • Changing data types: Limited support
  • Populated tables: More restrictions

Adding Columns

-- Add new column to existing table
ALTER TABLE Vendors
ADD vend_phone CHAR(20);

Adding Multiple Columns

-- Add several columns at once (some DBMSs)
ALTER TABLE Vendors
ADD vend_email VARCHAR(100),
ADD vend_website VARCHAR(200);

Adding Columns with Constraints

-- Add column with default value
ALTER TABLE Products
ADD prod_weight DECIMAL(5,2) DEFAULT 0.00;
 
-- Add required column (be careful with existing data!)
ALTER TABLE Customers
ADD cust_type CHAR(10) NOT NULL DEFAULT 'REGULAR';

Dropping Columns (Limited Support)

-- Remove column (not supported by all DBMSs)
ALTER TABLE Vendors
DROP COLUMN vend_phone;

Complex Table Changes Process

For major structural changes, use this manual process:

-- 1. Create new table with desired structure
CREATE TABLE Vendors_New
(
    vend_id        CHAR(10)     NOT NULL,
    vend_name      CHAR(50)     NOT NULL,
    vend_contact   VARCHAR(100),           -- New column
    vend_city      CHAR(50),
    vend_country   CHAR(50)
    -- Note: removed vend_address, vend_state, vend_zip
);
 
-- 2. Copy data from old table
INSERT INTO Vendors_New (vend_id, vend_name, vend_city, vend_country)
SELECT vend_id, vend_name, vend_city, vend_country
FROM Vendors;
 
-- 3. Verify data integrity
SELECT COUNT(*) FROM Vendors;     -- Original count
SELECT COUNT(*) FROM Vendors_New; -- New count should match
 
-- 4. Rename tables
DROP TABLE Vendors;               -- Remove old table
ALTER TABLE Vendors_New RENAME TO Vendors; -- Rename new table

Deleting Tables (DROP TABLE)

DROP TABLE permanently removes entire tables and all data.

Basic DROP TABLE Syntax

-- Permanently delete table
DROP TABLE CustCopy;

Safety Considerations:

  • No confirmation prompt
  • No undo capability
  • All data permanently lost
  • All indexes, triggers, constraints removed

Safe Deletion Process

-- 1. Create backup first
CREATE TABLE CustCopy_Backup AS
SELECT * FROM CustCopy;
 
-- 2. Verify backup
SELECT COUNT(*) FROM CustCopy;
SELECT COUNT(*) FROM CustCopy_Backup;
 
-- 3. Drop original table
DROP TABLE CustCopy;

Foreign Key Protection

-- This may fail if table is referenced by foreign keys
DROP TABLE Vendors;
-- Error: Cannot drop table referenced by foreign key
 
-- Solution: Drop dependent tables first, or remove foreign keys
ALTER TABLE Products DROP FOREIGN KEY fk_products_vendors;
DROP TABLE Vendors;

Renaming Tables

Table renaming syntax varies significantly between DBMSs:

MySQL, MariaDB, Oracle, PostgreSQL:

RENAME TABLE old_name TO new_name;
-- or
ALTER TABLE old_name RENAME TO new_name;

SQL Server:

EXEC sp_rename 'old_name', 'new_name';

SQLite:

ALTER TABLE old_name RENAME TO new_name;

DB2:

RENAME TABLE old_name TO new_name;

Best Practices and Safety Guidelines

1. Always Backup Before Changes

-- Create complete backup
CREATE TABLE Products_Backup AS
SELECT * FROM Products;
 
-- Or export to file (DBMS-specific)

2. Test on Development Environment

-- Test structure changes on copy first
CREATE TABLE Products_Test AS
SELECT * FROM Products;
 
ALTER TABLE Products_Test
ADD new_column VARCHAR(50);
 
-- Verify results before applying to production

3. Plan Table Design Carefully

-- Good initial design reduces need for ALTER TABLE
CREATE TABLE Customers
(
    cust_id          INTEGER         NOT NULL    PRIMARY KEY,
    cust_name        VARCHAR(100)    NOT NULL,
    cust_email       VARCHAR(255),               -- Plan for long emails
    cust_phone       VARCHAR(20),                -- Plan for international formats
    cust_address     TEXT,                       -- Flexible address storage
    cust_created     DATETIME        DEFAULT GETDATE(),
    cust_status      CHAR(10)        DEFAULT 'ACTIVE',
    cust_notes       TEXT                        -- Future expansion
);

4. Use Transactions for Safety

BEGIN TRANSACTION;
 
-- Perform table operations
ALTER TABLE Products
ADD prod_category VARCHAR(50) DEFAULT 'General';
 
-- Verify results
SELECT COUNT(*) FROM Products WHERE prod_category IS NULL;
 
-- If satisfied, commit; otherwise rollback
COMMIT;
-- OR ROLLBACK;

Common Patterns and Examples

Audit Table Creation

CREATE TABLE Customer_Audit
(
    audit_id        INTEGER         NOT NULL    PRIMARY KEY,
    cust_id         INTEGER         NOT NULL,
    operation       CHAR(10)        NOT NULL,   -- INSERT, UPDATE, DELETE
    old_values      TEXT,
    new_values      TEXT,
    changed_by      VARCHAR(50)     NOT NULL,
    changed_date    DATETIME        DEFAULT GETDATE()
);

Lookup Table Creation

CREATE TABLE Product_Categories
(
    category_id     INTEGER         NOT NULL    PRIMARY KEY,
    category_name   VARCHAR(50)     NOT NULL    UNIQUE,
    category_desc   TEXT,
    is_active       BOOLEAN         DEFAULT TRUE,
    created_date    DATETIME        DEFAULT GETDATE()
);

Temporary Table Creation

-- Create temporary table for data processing
CREATE TABLE #TempSales  -- SQL Server syntax
(
    sale_id         INTEGER,
    sale_amount     DECIMAL(10,2),
    sale_date       DATE
);

Challenge Solutions

Challenge 1: Add Website Column

-- Add website column to Vendors table
ALTER TABLE Vendors
ADD vend_web VARCHAR(255);  -- Large enough for URLs

Challenge 2: Update Vendor Websites

-- Update vendors with website information
UPDATE Vendors
SET vend_web = 'https://www.bearsrus.com'
WHERE vend_id = 'BRS01';
 
UPDATE Vendors
SET vend_web = 'https://www.dollhouse.com'
WHERE vend_id = 'DLL01';
 
UPDATE Vendors
SET vend_web = 'https://www.funandgames.co.uk'
WHERE vend_id = 'FNG01';
 
-- Or update multiple at once with CASE
UPDATE Vendors
SET vend_web = CASE vend_id
    WHEN 'BRS01' THEN 'https://www.bearsrus.com'
    WHEN 'DLL01' THEN 'https://www.dollhouse.com'
    WHEN 'FNG01' THEN 'https://www.funandgames.co.uk'
    ELSE vend_web
END
WHERE vend_id IN ('BRS01', 'DLL01', 'FNG01');

Advanced Table Management

Partitioned Tables

-- Create partitioned table (SQL Server example)
CREATE TABLE Sales_Partitioned
(
    sale_id         INTEGER         NOT NULL,
    sale_date       DATE            NOT NULL,
    sale_amount     DECIMAL(10,2)   NOT NULL
)
PARTITION BY RANGE (YEAR(sale_date))
(
    PARTITION p2023 VALUES LESS THAN (2024),
    PARTITION p2024 VALUES LESS THAN (2025),
    PARTITION p2025 VALUES LESS THAN (2026)
);

Computed Columns

-- Table with computed column (SQL Server)
CREATE TABLE OrderItems_Enhanced
(
    order_num       INTEGER         NOT NULL,
    quantity        INTEGER         NOT NULL,
    item_price      DECIMAL(8,2)    NOT NULL,
    line_total      AS (quantity * item_price)  -- Computed column
);

Views

Understanding Views

Views are virtual tables that contain queries rather than data. Unlike physical tables, views dynamically retrieve data when accessed.

Key Characteristics:

  • No data storage: Views contain only query definitions
  • Dynamic results: Data retrieved from underlying tables in real-time
  • Virtual abstraction: Appear and behave like regular tables
  • Query encapsulation: Hide complex SQL behind simple interfaces

Real-World Example

Instead of this complex query every time:

-- Complex join query
SELECT cust_name, cust_contact
FROM Customers, Orders, OrderItems
WHERE Customers.cust_id = Orders.cust_id
  AND OrderItems.order_num = Orders.order_num
  AND prod_id = 'RGAN01';

Create a view once:

CREATE VIEW ProductCustomers AS
SELECT cust_name, cust_contact, prod_id
FROM Customers, Orders, OrderItems
WHERE Customers.cust_id = Orders.cust_id
  AND OrderItems.order_num = Orders.order_num;

Then use simple queries:

-- Simple query using view
SELECT cust_name, cust_contact
FROM ProductCustomers
WHERE prod_id = 'RGAN01';

Why Use Views

1. Reuse SQL Statements

  • Write complex queries once
  • Use repeatedly without rewriting
  • Maintain consistency across applications

2. Simplify Complex Operations

  • Hide join complexity from users
  • Abstract complicated business logic
  • Provide intuitive data access

3. Expose Partial Data

  • Show only relevant columns
  • Filter sensitive information
  • Create focused data subsets

4. Secure Data Access

  • Grant access to views, not tables
  • Control what users can see
  • Implement row-level security

5. Format and Transform Data

  • Present data in user-friendly formats
  • Apply consistent calculations
  • Standardize data representation

View Rules and Restrictions

General Rules:

  • Unique names: Cannot duplicate table or view names
  • No limit: Create as many views as needed
  • Security required: Need appropriate permissions
  • Nesting allowed: Views can reference other views

Common Restrictions:

RestrictionDescriptionImpact
No ORDER BYMany DBMSs prohibit ORDER BY in viewsSort in SELECT instead
Named columnsCalculated fields must have aliasesUse AS keyword
No indexesViews cannot be indexedPerformance considerations
Read-onlySome DBMSs (SQLite) are read-onlyNo INSERT/UPDATE/DELETE
No triggersViews cannot have triggersLimited automation

Performance Considerations:

  • Complex views: Multiple joins can slow performance
  • Nested views: Each level adds processing overhead
  • Real-time processing: No data caching in views

Creating Views

Basic CREATE VIEW Syntax

CREATE VIEW view_name AS
SELECT column1, column2, ...
FROM table_name
WHERE conditions;

View Management:

-- Create view
CREATE VIEW ViewName AS SELECT ...;
 
-- Drop view
DROP VIEW ViewName;
 
-- Update view (drop and recreate)
DROP VIEW ViewName;
CREATE VIEW ViewName AS SELECT ...;

Using Views to Simplify Complex Joins

Complex Join Example

-- Create view for product customers
CREATE VIEW ProductCustomers AS
SELECT cust_name, cust_contact, prod_id
FROM Customers, Orders, OrderItems
WHERE Customers.cust_id = Orders.cust_id
  AND OrderItems.order_num = Orders.order_num;

Using the View

-- Find customers who ordered specific product
SELECT cust_name, cust_contact
FROM ProductCustomers
WHERE prod_id = 'RGAN01';

Output:

cust_name              cust_contact
-------------------    ------------------
Fun4All                Denise L. Stephens
The Toy Store          Kim Howard

Benefits:

  • Simplified queries: No need to remember join syntax
  • Reusable: Works for any product ID
  • Maintainable: Change join logic in one place

Using Views to Reformat Data

Data Formatting View

-- SQL Server syntax
CREATE VIEW VendorLocations AS
SELECT RTRIM(vend_name) + ' (' + RTRIM(vend_country) + ')'
       AS vend_title
FROM Vendors;
 
-- Standard SQL syntax
CREATE VIEW VendorLocations AS
SELECT RTRIM(vend_name) || ' (' || RTRIM(vend_country) || ')'
       AS vend_title
FROM Vendors;

Using the Formatting View

SELECT * FROM VendorLocations;

Output:

vend_title
-----------------------------------------------------------
Bear Emporium (USA)
Bears R Us (USA)
Doll House Inc. (USA)
Fun and Games (England)
Furball Inc. (USA)
Jouets et ours (France)

Advantages:

  • Consistent formatting: Same format everywhere
  • Easy maintenance: Change format in one place
  • User-friendly: Hide complex concatenation logic

Using Views to Filter Data

Filtering View Example

-- Create view for customers with email addresses
CREATE VIEW CustomerEMailList AS
SELECT cust_id, cust_name, cust_email
FROM Customers
WHERE cust_email IS NOT NULL;

Using the Filter View

SELECT * FROM CustomerEMailList;

Output:

cust_id     cust_name     cust_email
----------  ------------  ---------------------
1000000001  Village Toys  sales@villagetoys.com
1000000003  Fun4All       jjones@fun4all.com
1000000004  Fun4All       dstephens@fun4all.com

Combining WHERE Clauses:

-- Additional filtering on view
SELECT cust_name, cust_email
FROM CustomerEMailList
WHERE cust_name LIKE 'Fun%';
-- Combines view's WHERE with query's WHERE

Using Views with Calculated Fields

Calculated Field View

CREATE VIEW OrderItemsExpanded AS
SELECT order_num,
       prod_id,
       quantity,
       item_price,
       quantity * item_price AS expanded_price
FROM OrderItems;

Using the Calculated View

SELECT *
FROM OrderItemsExpanded
WHERE order_num = 20008;

Output:

order_num  prod_id   quantity   item_price   expanded_price
---------  -------   ---------  ----------   --------------
20008      RGAN01    5          4.99         24.95
20008      BR03      5          11.99        59.95
20008      BNBG01    10         3.49         34.90

Benefits:

  • Automatic calculations: No need to repeat formulas
  • Consistency: Same calculation logic everywhere
  • Simplicity: Hide complex mathematical operations

Advanced View Patterns

Security View

-- Employee view without salary information
CREATE VIEW EmployeePublic AS
SELECT emp_id, emp_name, emp_department, emp_title
FROM Employees;
-- Excludes emp_salary, emp_ssn, etc.

Aggregation View

-- Sales summary by customer
CREATE VIEW CustomerSales AS
SELECT c.cust_id,
       c.cust_name,
       COUNT(o.order_num) AS total_orders,
       SUM(oi.quantity * oi.item_price) AS total_spent
FROM Customers c
LEFT JOIN Orders o ON c.cust_id = o.cust_id
LEFT JOIN OrderItems oi ON o.order_num = oi.order_num
GROUP BY c.cust_id, c.cust_name;

Union View

-- Combine current and archived orders
CREATE VIEW AllOrders AS
SELECT order_num, order_date, cust_id, 'CURRENT' AS status
FROM Orders
UNION ALL
SELECT order_num, order_date, cust_id, 'ARCHIVED' AS status
FROM OrdersArchive;

Nested Views

-- Base view
CREATE VIEW ActiveCustomers AS
SELECT * FROM Customers
WHERE status = 'ACTIVE';
 
-- View using another view
CREATE VIEW ActiveCustomerOrders AS
SELECT ac.cust_name, o.order_num, o.order_date
FROM ActiveCustomers ac
JOIN Orders o ON ac.cust_id = o.cust_id;

Best Practices

1. Design for Reusability

-- ✅ Good: General purpose view
CREATE VIEW ProductCustomers AS
SELECT cust_name, cust_contact, prod_id
FROM Customers, Orders, OrderItems
WHERE Customers.cust_id = Orders.cust_id
  AND OrderItems.order_num = Orders.order_num;
 
-- ❌ Bad: Too specific
CREATE VIEW RGAN01Customers AS
SELECT cust_name, cust_contact
FROM Customers, Orders, OrderItems
WHERE Customers.cust_id = Orders.cust_id
  AND OrderItems.order_num = Orders.order_num
  AND prod_id = 'RGAN01';

2. Use Meaningful Names

-- ✅ Good: Descriptive names
CREATE VIEW CustomerOrderSummary AS ...
CREATE VIEW ActiveProductCatalog AS ...
 
-- ❌ Bad: Unclear names
CREATE VIEW View1 AS ...
CREATE VIEW TempView AS ...

3. Document Complex Views

-- Customer sales performance view
-- Includes: total orders, revenue, average order value
-- Excludes: inactive customers, cancelled orders
CREATE VIEW CustomerSalesMetrics AS
SELECT 
    c.cust_id,
    c.cust_name,
    COUNT(o.order_num) AS total_orders,
    SUM(oi.quantity * oi.item_price) AS total_revenue,
    AVG(oi.quantity * oi.item_price) AS avg_order_value
FROM Customers c
JOIN Orders o ON c.cust_id = o.cust_id
JOIN OrderItems oi ON o.order_num = oi.order_num
WHERE c.status = 'ACTIVE'
  AND o.status != 'CANCELLED'
GROUP BY c.cust_id, c.cust_name;

Challenge Solutions

Challenge 1: CustomersWithOrders View

-- Create view for customers who have placed orders
CREATE VIEW CustomersWithOrders AS
SELECT DISTINCT c.*
FROM Customers c
INNER JOIN Orders o ON c.cust_id = o.cust_id;
 
-- Test the view
SELECT cust_name, cust_contact
FROM CustomersWithOrders
ORDER BY cust_name;

Challenge 2: What's Wrong?

-- ❌ INCORRECT: ORDER BY not allowed in many DBMS views
CREATE VIEW OrderItemsExpanded AS
SELECT order_num,
       prod_id,
       quantity,
       item_price,
       quantity*item_price AS expanded_price
FROM OrderItems
ORDER BY order_num;  -- This causes the error
 
-- ✅ CORRECT: Remove ORDER BY from view
CREATE VIEW OrderItemsExpanded AS
SELECT order_num,
       prod_id,
       quantity,
       item_price,
       quantity*item_price AS expanded_price
FROM OrderItems;
 
-- Apply ORDER BY when querying the view
SELECT * FROM OrderItemsExpanded ORDER BY order_num;

Common View Patterns

Reporting Views

-- Monthly sales report view
CREATE VIEW MonthlySalesReport AS
SELECT 
    YEAR(o.order_date) AS sale_year,
    MONTH(o.order_date) AS sale_month,
    COUNT(DISTINCT o.order_num) AS total_orders,
    COUNT(DISTINCT o.cust_id) AS unique_customers,
    SUM(oi.quantity * oi.item_price) AS total_revenue
FROM Orders o
JOIN OrderItems oi ON o.order_num = oi.order_num
GROUP BY YEAR(o.order_date), MONTH(o.order_date);

Data Warehouse Views

-- Fact table view for analytics
CREATE VIEW SalesFactView AS
SELECT 
    o.order_date,
    c.cust_id,
    c.cust_state,
    p.prod_id,
    p.vend_id,
    oi.quantity,
    oi.item_price,
    oi.quantity * oi.item_price AS line_total
FROM Orders o
JOIN Customers c ON o.cust_id = c.cust_id
JOIN OrderItems oi ON o.order_num = oi.order_num
JOIN Products p ON oi.prod_id = p.prod_id;

Stored Procedures

Understanding Stored Procedures

Stored procedures are collections of one or more SQL statements saved for future use. Think of them as database "programs" that can execute complex operations with a single call.

Real-World Scenario: Order Processing

Consider processing a customer order:

  1. Check inventory: Verify items are in stock
  2. Reserve items: Update quantities to prevent overselling
  3. Handle backorders: Coordinate with vendors for out-of-stock items
  4. Notify customer: Send status updates on availability

This requires multiple SQL statements across many tables with conditional logic - perfect for a stored procedure.

Key Characteristics:

  • Batch operations: Execute multiple statements as one unit
  • Conditional logic: Include IF/ELSE, loops, and variables
  • Reusable code: Write once, use many times
  • Server-side execution: Run directly on database server

Why Use Stored Procedures

Primary Benefits:

1. Simplicity

  • Encapsulate complexity: Hide complex operations behind simple calls
  • Consistent interface: Same procedure works across applications
  • Reduced coding: No need to rewrite complex logic

2. Security

  • Access control: Grant procedure execution without table access
  • Data protection: Prevent direct table manipulation
  • Reduced attack surface: Limit SQL injection opportunities

3. Performance

  • Compiled code: Pre-compiled for faster execution
  • Reduced network traffic: Single call instead of multiple statements
  • Optimized execution: Database can optimize procedure plans

4. Data Consistency

  • Standardized operations: Everyone uses same business logic
  • Error prevention: Centralized validation and processing
  • Transaction control: Ensure data integrity across operations

5. Change Management

  • Centralized updates: Modify logic in one place
  • Version control: Track procedure changes over time
  • Backward compatibility: Maintain consistent interfaces

Drawbacks and Considerations

Challenges:

ChallengeDescriptionMitigation
PortabilitySyntax varies dramatically between DBMSsUse standard SQL where possible
ComplexityHarder to write than simple SQLInvest in training and documentation
DebuggingMore difficult to troubleshootUse comprehensive logging
Version controlDatabase objects harder to trackUse database migration tools
TestingUnit testing more complexDevelop testing frameworks

Database Compatibility:

-- ❌ SQLite: Does not support stored procedures
-- ✅ Oracle: Full support with PL/SQL
-- ✅ SQL Server: Full support with T-SQL  
-- ✅ MySQL: Supports stored procedures
-- ✅ PostgreSQL: Supports stored functions

Executing Stored Procedures

Basic EXECUTE Syntax

-- Basic execution
EXECUTE procedure_name;
 
-- With parameters
EXECUTE procedure_name(param1, param2, param3);

Example: Adding a New Product

-- Execute stored procedure with parameters
EXECUTE AddNewProduct('JTS01',
                      'Stuffed Eiffel Tower',
                      6.49,
                      'Plush stuffed toy with the text La Tour Eiffel in red white and blue');

What This Procedure Does:

  1. Validates input: Ensures all required parameters have values
  2. Generates ID: Creates unique primary key automatically
  3. Inserts data: Adds new product to Products table
  4. Returns result: Confirms successful creation

Parameter Types:

-- Input parameters (default)
EXECUTE GetCustomer(1000000001);
 
-- Output parameters  
EXECUTE GetCustomerCount(@count OUTPUT);
 
-- Input/Output parameters
EXECUTE UpdateInventory(@product_id, @quantity INOUT);

Execution Options:

Optional Parameters:

-- With default values
EXECUTE CreateOrder(@cust_id, @priority = 'NORMAL');

Named Parameters:

-- Out-of-order specification
EXECUTE CreateOrder(priority = 'HIGH', cust_id = 1000000001);

Return Values:

-- Capture return code
DECLARE @result INT;
EXECUTE @result = ValidateCustomer(1000000001);

Creating Stored Procedures

Oracle Syntax (PL/SQL)

-- Create procedure to count customers with email
CREATE PROCEDURE MailingListCount (
    ListCount OUT INTEGER
)
IS
    v_rows INTEGER;
BEGIN
    SELECT COUNT(*) INTO v_rows
    FROM Customers
    WHERE cust_email IS NOT NULL;
    
    ListCount := v_rows;
END;
/

Executing Oracle Procedure:

-- Declare variable for output
DECLARE
    ReturnValue NUMBER;
BEGIN
    -- Execute procedure
    MailingListCount(ReturnValue);
    
    -- Display result
    DBMS_OUTPUT.PUT_LINE('Count: ' || ReturnValue);
END;
/

SQL Server Syntax (T-SQL)

-- Create procedure to count customers with email
CREATE PROCEDURE MailingListCount
AS
BEGIN
    DECLARE @cnt INTEGER;
    
    SELECT @cnt = COUNT(*)
    FROM Customers
    WHERE cust_email IS NOT NULL;
    
    RETURN @cnt;
END;

Executing SQL Server Procedure:

-- Declare variable for return value
DECLARE @ReturnValue INT;
 
-- Execute procedure and capture return value
EXECUTE @ReturnValue = MailingListCount;
 
-- Display result
SELECT @ReturnValue AS CustomerCount;

Advanced Stored Procedure Examples

Order Creation Procedure (SQL Server)

CREATE PROCEDURE NewOrder @cust_id CHAR(10)
AS
BEGIN
    -- Declare variable for order number
    DECLARE @order_num INTEGER;
    
    -- Get current highest order number
    SELECT @order_num = MAX(order_num)
    FROM Orders;
    
    -- Determine next order number
    SELECT @order_num = @order_num + 1;
    
    -- Insert new order
    INSERT INTO Orders(order_num, order_date, cust_id)
    VALUES(@order_num, GETDATE(), @cust_id);
    
    -- Return order number
    RETURN @order_num;
END;

Using Auto-Generated IDs:

CREATE PROCEDURE NewOrderAuto @cust_id CHAR(10)
AS
BEGIN
    -- Insert new order (order_num auto-generated)
    INSERT INTO Orders(cust_id, order_date)
    VALUES(@cust_id, GETDATE());
    
    -- Return generated order number
    SELECT order_num = @@IDENTITY;
END;

Complex Business Logic Example:

CREATE PROCEDURE ProcessOrder 
    @cust_id CHAR(10),
    @prod_id CHAR(10),
    @quantity INTEGER,
    @order_status VARCHAR(20) OUTPUT
AS
BEGIN
    DECLARE @available_qty INTEGER;
    DECLARE @order_num INTEGER;
    
    -- Check inventory
    SELECT @available_qty = quantity_in_stock
    FROM Products
    WHERE prod_id = @prod_id;
    
    -- Validate inventory
    IF @available_qty >= @quantity
    BEGIN
        -- Create order
        EXECUTE @order_num = NewOrder @cust_id;
        
        -- Add order item
        INSERT INTO OrderItems(order_num, prod_id, quantity, item_price)
        SELECT @order_num, @prod_id, @quantity, prod_price
        FROM Products
        WHERE prod_id = @prod_id;
        
        -- Update inventory
        UPDATE Products
        SET quantity_in_stock = quantity_in_stock - @quantity
        WHERE prod_id = @prod_id;
        
        SET @order_status = 'CONFIRMED';
    END
    ELSE
    BEGIN
        SET @order_status = 'BACKORDERED';
    END;
    
    RETURN @order_num;
END;

Parameter Types and Usage

Input Parameters (IN)

-- Oracle
CREATE PROCEDURE GetCustomerOrders(
    p_cust_id IN VARCHAR2
)
IS
BEGIN
    SELECT * FROM Orders
    WHERE cust_id = p_cust_id;
END;
 
-- SQL Server  
CREATE PROCEDURE GetCustomerOrders
    @cust_id VARCHAR(10)
AS
BEGIN
    SELECT * FROM Orders
    WHERE cust_id = @cust_id;
END;

Output Parameters (OUT)

-- Oracle
CREATE PROCEDURE GetOrderTotal(
    p_order_num IN INTEGER,
    p_total OUT NUMBER
)
IS
BEGIN
    SELECT SUM(quantity * item_price) INTO p_total
    FROM OrderItems
    WHERE order_num = p_order_num;
END;
 
-- SQL Server
CREATE PROCEDURE GetOrderTotal
    @order_num INTEGER,
    @total MONEY OUTPUT
AS
BEGIN
    SELECT @total = SUM(quantity * item_price)
    FROM OrderItems
    WHERE order_num = @order_num;
END;

Input/Output Parameters (INOUT)

-- Oracle
CREATE PROCEDURE AdjustInventory(
    p_prod_id IN VARCHAR2,
    p_quantity IN OUT INTEGER
)
IS
    v_current_qty INTEGER;
BEGIN
    -- Get current quantity
    SELECT quantity_in_stock INTO v_current_qty
    FROM Products
    WHERE prod_id = p_prod_id;
    
    -- Calculate new quantity
    p_quantity := v_current_qty + p_quantity;
    
    -- Update inventory
    UPDATE Products
    SET quantity_in_stock = p_quantity
    WHERE prod_id = p_prod_id;
END;

Error Handling in Stored Procedures

SQL Server Error Handling:

CREATE PROCEDURE SafeOrderCreation
    @cust_id CHAR(10),
    @result VARCHAR(50) OUTPUT
AS
BEGIN
    BEGIN TRY
        -- Start transaction
        BEGIN TRANSACTION;
        
        -- Validate customer exists
        IF NOT EXISTS (SELECT 1 FROM Customers WHERE cust_id = @cust_id)
        BEGIN
            SET @result = 'ERROR: Customer not found';
            RETURN -1;
        END;
        
        -- Create order
        INSERT INTO Orders(cust_id, order_date)
        VALUES(@cust_id, GETDATE());
        
        -- Commit transaction
        COMMIT TRANSACTION;
        SET @result = 'SUCCESS: Order created';
        RETURN @@IDENTITY;
        
    END TRY
    BEGIN CATCH
        -- Rollback on error
        IF @@TRANCOUNT > 0
            ROLLBACK TRANSACTION;
            
        SET @result = 'ERROR: ' + ERROR_MESSAGE();
        RETURN -1;
    END CATCH;
END;

Oracle Error Handling:

CREATE PROCEDURE SafeOrderCreation(
    p_cust_id IN VARCHAR2,
    p_order_num OUT INTEGER,
    p_result OUT VARCHAR2
)
IS
    v_count INTEGER;
BEGIN
    -- Validate customer exists
    SELECT COUNT(*) INTO v_count
    FROM Customers
    WHERE cust_id = p_cust_id;
    
    IF v_count = 0 THEN
        p_result := 'ERROR: Customer not found';
        p_order_num := -1;
        RETURN;
    END IF;
    
    -- Create order
    INSERT INTO Orders(order_num, cust_id, order_date)
    VALUES(order_seq.NEXTVAL, p_cust_id, SYSDATE)
    RETURNING order_num INTO p_order_num;
    
    p_result := 'SUCCESS: Order created';
    
EXCEPTION
    WHEN OTHERS THEN
        p_result := 'ERROR: ' || SQLERRM;
        p_order_num := -1;
        ROLLBACK;
END;

Best Practices

1. Code Documentation

-- ✅ Good: Well-documented procedure
/*
Purpose: Create new customer order with validation
Parameters:
  @cust_id - Customer ID (required)
  @priority - Order priority (optional, default 'NORMAL')
Returns: Order number if successful, -1 if error
Author: Database Team
Created: 2024-01-15
Modified: 2024-01-20 - Added priority parameter
*/
CREATE PROCEDURE CreateCustomerOrder
    @cust_id CHAR(10),
    @priority VARCHAR(10) = 'NORMAL'
AS
BEGIN
    -- Validate customer exists
    IF NOT EXISTS (SELECT 1 FROM Customers WHERE cust_id = @cust_id)
    BEGIN
        RETURN -1; -- Customer not found
    END;
    
    -- Create order with specified priority
    INSERT INTO Orders(cust_id, order_date, priority)
    VALUES(@cust_id, GETDATE(), @priority);
    
    -- Return new order number
    RETURN @@IDENTITY;
END;

2. Parameter Validation

CREATE PROCEDURE ValidatedOrderCreation
    @cust_id CHAR(10),
    @prod_id CHAR(10),
    @quantity INTEGER
AS
BEGIN
    -- Input validation
    IF @cust_id IS NULL OR LEN(@cust_id) = 0
    BEGIN
        RAISERROR('Customer ID is required', 16, 1);
        RETURN -1;
    END;
    
    IF @quantity <= 0
    BEGIN
        RAISERROR('Quantity must be positive', 16, 1);
        RETURN -1;
    END;
    
    -- Business logic continues...
END;

3. Transaction Management

CREATE PROCEDURE AtomicOrderProcess
    @cust_id CHAR(10),
    @items OrderItemsTableType READONLY
AS
BEGIN
    BEGIN TRANSACTION;
    
    BEGIN TRY
        -- Create order header
        INSERT INTO Orders(cust_id, order_date)
        VALUES(@cust_id, GETDATE());
        
        DECLARE @order_num INTEGER = @@IDENTITY;
        
        -- Add order items
        INSERT INTO OrderItems(order_num, prod_id, quantity, item_price)
        SELECT @order_num, prod_id, quantity, price
        FROM @items;
        
        -- Update inventory
        UPDATE p SET quantity_in_stock = quantity_in_stock - i.quantity
        FROM Products p
        INNER JOIN @items i ON p.prod_id = i.prod_id;
        
        COMMIT TRANSACTION;
        RETURN @order_num;
        
    END TRY
    BEGIN CATCH
        ROLLBACK TRANSACTION;
        THROW;
    END CATCH;
END;

4. Performance Optimization

-- ✅ Good: Efficient procedure
CREATE PROCEDURE GetCustomerOrderSummary
    @cust_id CHAR(10)
AS
BEGIN
    -- Use SET NOCOUNT to reduce network traffic
    SET NOCOUNT ON;
    
    -- Single query instead of multiple lookups
    SELECT 
        c.cust_name,
        COUNT(o.order_num) AS total_orders,
        SUM(oi.quantity * oi.item_price) AS total_spent,
        MAX(o.order_date) AS last_order_date
    FROM Customers c
    LEFT JOIN Orders o ON c.cust_id = o.cust_id
    LEFT JOIN OrderItems oi ON o.order_num = oi.order_num
    WHERE c.cust_id = @cust_id
    GROUP BY c.cust_id, c.cust_name;
END;

Common Stored Procedure Patterns

CRUD Operations

-- Create
CREATE PROCEDURE CreateCustomer
    @cust_name VARCHAR(50),
    @cust_email VARCHAR(100),
    @cust_id CHAR(10) OUTPUT
AS
BEGIN
    -- Generate new customer ID
    SELECT @cust_id = 'C' + RIGHT('000000000' + CAST(NEXT VALUE FOR cust_seq AS VARCHAR), 9);
    
    INSERT INTO Customers(cust_id, cust_name, cust_email)
    VALUES(@cust_id, @cust_name, @cust_email);
END;
 
-- Read
CREATE PROCEDURE GetCustomer
    @cust_id CHAR(10)
AS
BEGIN
    SELECT * FROM Customers
    WHERE cust_id = @cust_id;
END;
 
-- Update
CREATE PROCEDURE UpdateCustomer
    @cust_id CHAR(10),
    @cust_name VARCHAR(50),
    @cust_email VARCHAR(100)
AS
BEGIN
    UPDATE Customers
    SET cust_name = @cust_name,
        cust_email = @cust_email,
        modified_date = GETDATE()
    WHERE cust_id = @cust_id;
END;
 
-- Delete
CREATE PROCEDURE DeleteCustomer
    @cust_id CHAR(10)
AS
BEGIN
    DELETE FROM Customers
    WHERE cust_id = @cust_id;
END;

Reporting Procedures

CREATE PROCEDURE SalesReport
    @start_date DATE,
    @end_date DATE
AS
BEGIN
    SELECT 
        p.prod_name,
        SUM(oi.quantity) AS units_sold,
        SUM(oi.quantity * oi.item_price) AS revenue,
        AVG(oi.item_price) AS avg_price
    FROM Orders o
    INNER JOIN OrderItems oi ON o.order_num = oi.order_num
    INNER JOIN Products p ON oi.prod_id = p.prod_id
    WHERE o.order_date BETWEEN @start_date AND @end_date
    GROUP BY p.prod_id, p.prod_name
    ORDER BY revenue DESC;
END;

Transaction Processing

Understanding Transaction Processing

Transaction processing maintains database integrity by ensuring that batches of SQL operations execute completely or not at all. This prevents partial operations from corrupting your data.

The Problem: Partial Operations

Consider adding an order to a system with multiple related tables:

  1. Check/add customer to Customers table
  2. Retrieve customer ID for reference
  3. Add order to Orders table
  4. Retrieve order ID for reference
  5. Add order items to OrderItems table

What happens if the system fails partway through?

Failure PointResultProblem
After customer, before orderCustomer exists, no order✅ Acceptable - can retry
After order, before itemsEmpty order in database❌ Orphaned order record
During item insertionPartial order❌ Incomplete, unknown state

The Solution: Transaction Processing

-- All operations succeed together, or all fail together
BEGIN TRANSACTION
    -- Add customer
    -- Add order  
    -- Add order items
COMMIT TRANSACTION  -- Save all changes
 
-- OR if any step fails:
ROLLBACK TRANSACTION  -- Undo all changes

Key Transaction Concepts

Core Terminology:

  • Transaction: Block of SQL statements treated as single unit
  • Rollback: Process of undoing specified SQL statements
  • Commit: Writing unsaved SQL statements to database tables
  • Savepoint: Temporary placeholder for partial rollbacks

ACID Properties:

PropertyDescriptionBenefit
AtomicityAll or nothing executionPrevents partial updates
ConsistencyDatabase rules maintainedData integrity preserved
IsolationTransactions don't interfereConcurrent access safe
DurabilityCommitted changes persistData survives system failures

Statement Compatibility:

-- ✅ Can be rolled back
INSERT, UPDATE, DELETE
 
-- ❌ Cannot be rolled back  
SELECT (no point anyway)
CREATE, DROP (structural changes)

Controlling Transactions

Database-Specific Syntax

SQL Server:

BEGIN TRANSACTION
    -- SQL statements here
COMMIT TRANSACTION

MySQL/MariaDB:

START TRANSACTION
    -- SQL statements here
COMMIT

Oracle:

SET TRANSACTION
    -- SQL statements here
COMMIT

PostgreSQL (ANSI SQL):

BEGIN
    -- SQL statements here
COMMIT

Transaction Lifecycle:

-- 1. Start transaction
BEGIN TRANSACTION
 
-- 2. Execute operations
INSERT INTO table1 VALUES (...);
UPDATE table2 SET ...;
DELETE FROM table3 WHERE ...;
 
-- 3. End transaction (choose one)
COMMIT TRANSACTION    -- Save all changes
-- OR
ROLLBACK TRANSACTION  -- Undo all changes

Using ROLLBACK

Basic Rollback Example

-- Undo a DELETE operation
DELETE FROM Orders;
ROLLBACK;
-- All deleted rows are restored

Conditional Rollback

BEGIN TRANSACTION
    UPDATE Products 
    SET quantity_in_stock = quantity_in_stock - 10
    WHERE prod_id = 'BR01';
    
    -- Check if update was valid
    IF (SELECT quantity_in_stock FROM Products WHERE prod_id = 'BR01') < 0
    BEGIN
        ROLLBACK TRANSACTION;  -- Undo if negative inventory
        PRINT 'Insufficient inventory';
    END
    ELSE
    BEGIN
        COMMIT TRANSACTION;    -- Save if valid
        PRINT 'Inventory updated';
    END

Rollback Scenarios:

-- Error handling with rollback
BEGIN TRANSACTION
    INSERT INTO Customers(cust_id, cust_name) 
    VALUES('C001', 'New Customer');
    
    -- Simulate error condition
    IF @@ERROR <> 0
    BEGIN
        ROLLBACK TRANSACTION;
        RETURN;
    END
    
    INSERT INTO Orders(order_num, cust_id, order_date)
    VALUES(1001, 'C001', GETDATE());
    
    IF @@ERROR <> 0
    BEGIN
        ROLLBACK TRANSACTION;
        RETURN;
    END
    
COMMIT TRANSACTION;

Using COMMIT

Implicit vs Explicit Commits

-- Implicit commit (automatic)
INSERT INTO Products VALUES('P001', 'New Product', 19.99);
-- Automatically saved to database
 
-- Explicit commit (manual control)
BEGIN TRANSACTION
    INSERT INTO Products VALUES('P002', 'Another Product', 29.99);
    -- Not yet saved to database
COMMIT TRANSACTION;
-- Now saved to database

Complete Order Processing Example

-- SQL Server: Complete order transaction
BEGIN TRANSACTION
    -- Delete order items first (foreign key constraint)
    DELETE FROM OrderItems WHERE order_num = 12345;
    
    -- Then delete the order
    DELETE FROM Orders WHERE order_num = 12345;
    
    -- Only commit if both operations succeeded
COMMIT TRANSACTION;

Oracle Equivalent:

SET TRANSACTION
    DELETE FROM OrderItems WHERE order_num = 12345;
    DELETE FROM Orders WHERE order_num = 12345;
COMMIT;

Benefits of Explicit Commits:

  1. Control timing: Decide exactly when changes are saved
  2. Batch operations: Group related changes together
  3. Error recovery: Ability to undo on problems
  4. Performance: Reduce disk I/O by batching writes

Using Savepoints

Creating Savepoints

-- SQL Server
SAVE TRANSACTION savepoint_name;
 
-- MySQL/MariaDB/Oracle  
SAVEPOINT savepoint_name;

Rolling Back to Savepoints

-- SQL Server
ROLLBACK TRANSACTION savepoint_name;
 
-- MySQL/MariaDB/Oracle
ROLLBACK TO savepoint_name;

Complex Transaction with Savepoints

-- SQL Server: Order processing with savepoints
BEGIN TRANSACTION
 
    -- Add customer (always keep this)
    INSERT INTO Customers(cust_id, cust_name)
    VALUES(1000000010, 'Toys Emporium');
    
    -- Create savepoint after customer
    SAVE TRANSACTION StartOrder;
    
    -- Add order
    INSERT INTO Orders(order_num, order_date, cust_id)
    VALUES(20100, '2020/12/1', 1000000010);
    
    IF @@ERROR <> 0 
        ROLLBACK TRANSACTION StartOrder;  -- Keep customer, undo order
    
    -- Add first order item
    INSERT INTO OrderItems(order_num, order_item, prod_id, quantity, item_price)
    VALUES(20100, 1, 'BR01', 100, 5.49);
    
    IF @@ERROR <> 0 
        ROLLBACK TRANSACTION StartOrder;  -- Keep customer, undo order & items
    
    -- Add second order item  
    INSERT INTO OrderItems(order_num, order_item, prod_id, quantity, item_price)
    VALUES(20100, 2, 'BR03', 100, 10.99);
    
    IF @@ERROR <> 0 
        ROLLBACK TRANSACTION StartOrder;  -- Keep customer, undo order & items
    
COMMIT TRANSACTION;  -- Save everything if no errors

Multiple Savepoints Strategy:

BEGIN TRANSACTION
    -- Customer operations
    INSERT INTO Customers(...);
    SAVE TRANSACTION customer_added;
    
    -- Order operations  
    INSERT INTO Orders(...);
    SAVE TRANSACTION order_added;
    
    -- Order item operations
    INSERT INTO OrderItems(...);
    SAVE TRANSACTION item1_added;
    
    INSERT INTO OrderItems(...);
    SAVE TRANSACTION item2_added;
    
    -- Can rollback to any savepoint as needed
    -- ROLLBACK TRANSACTION item1_added;  -- Keeps customer & order
    -- ROLLBACK TRANSACTION order_added;   -- Keeps only customer
    
COMMIT TRANSACTION;

Advanced Transaction Patterns

Nested Transaction Simulation

-- Simulate nested transactions with savepoints
BEGIN TRANSACTION MainTransaction
    
    INSERT INTO Orders(...);
    SAVE TRANSACTION OrderSavepoint;
    
    -- "Nested" transaction for order items
    BEGIN TRY
        INSERT INTO OrderItems(...);
        INSERT INTO OrderItems(...);
        INSERT INTO OrderItems(...);
        -- "Nested commit" - just continue
    END TRY
    BEGIN CATCH
        -- "Nested rollback" - rollback to savepoint
        ROLLBACK TRANSACTION OrderSavepoint;
        THROW;  -- Re-raise error
    END CATCH
    
COMMIT TRANSACTION MainTransaction;

Batch Processing with Transactions

-- Process large datasets in batches
DECLARE @BatchSize INT = 1000;
DECLARE @RowsProcessed INT = 0;
 
WHILE EXISTS (SELECT 1 FROM TempTable WHERE processed = 0)
BEGIN
    BEGIN TRANSACTION
        
        -- Process batch
        UPDATE TOP(@BatchSize) TempTable 
        SET processed = 1, 
            processed_date = GETDATE()
        WHERE processed = 0;
        
        SET @RowsProcessed = @@ROWCOUNT;
        
        -- Commit batch
        COMMIT TRANSACTION;
        
        PRINT CAST(@RowsProcessed AS VARCHAR) + ' rows processed';
        
        -- Brief pause to avoid blocking
        WAITFOR DELAY '00:00:01';
END

Error Handling Patterns

Try-Catch with Transactions (SQL Server)

BEGIN TRANSACTION
BEGIN TRY
    -- Risky operations
    INSERT INTO Orders(order_num, cust_id, order_date)
    VALUES(20100, 'INVALID_CUSTOMER', GETDATE());
    
    INSERT INTO OrderItems(order_num, prod_id, quantity)
    VALUES(20100, 'INVALID_PRODUCT', 5);
    
    -- If we get here, commit
    COMMIT TRANSACTION;
    PRINT 'Transaction completed successfully';
    
END TRY
BEGIN CATCH
    -- Error occurred, rollback
    IF @@TRANCOUNT > 0
        ROLLBACK TRANSACTION;
    
    -- Report error details
    PRINT 'Error: ' + ERROR_MESSAGE();
    PRINT 'Transaction rolled back';
END CATCH

Manual Error Checking

BEGIN TRANSACTION
    DECLARE @ErrorCount INT = 0;
    
    INSERT INTO Customers(cust_id, cust_name)
    VALUES('C001', 'Test Customer');
    SET @ErrorCount = @ErrorCount + @@ERROR;
    
    INSERT INTO Orders(order_num, cust_id, order_date)  
    VALUES(1001, 'C001', GETDATE());
    SET @ErrorCount = @ErrorCount + @@ERROR;
    
    INSERT INTO OrderItems(order_num, prod_id, quantity, item_price)
    VALUES(1001, 'P001', 2, 19.99);
    SET @ErrorCount = @ErrorCount + @@ERROR;
    
    -- Check for any errors
    IF @ErrorCount = 0
    BEGIN
        COMMIT TRANSACTION;
        PRINT 'All operations successful';
    END
    ELSE
    BEGIN
        ROLLBACK TRANSACTION;
        PRINT 'Errors occurred, transaction rolled back';
    END

Performance Considerations

Transaction Size Management

-- ❌ Bad: Very large transaction
BEGIN TRANSACTION
    -- Process 1 million rows in single transaction
    UPDATE LargeTable SET status = 'PROCESSED';  -- Locks entire table
COMMIT TRANSACTION;
 
-- ✅ Good: Batch processing
DECLARE @BatchSize INT = 10000;
WHILE EXISTS (SELECT 1 FROM LargeTable WHERE status = 'PENDING')
BEGIN
    BEGIN TRANSACTION
        UPDATE TOP(@BatchSize) LargeTable 
        SET status = 'PROCESSED'
        WHERE status = 'PENDING';
    COMMIT TRANSACTION;
    
    -- Allow other transactions to proceed
    WAITFOR DELAY '00:00:00.100';
END

Lock Duration Minimization

-- ✅ Good: Minimize lock time
BEGIN TRANSACTION
    -- Quick operations only
    DECLARE @order_num INT;
    
    INSERT INTO Orders(cust_id, order_date)
    VALUES('C001', GETDATE());
    
    SET @order_num = @@IDENTITY;
    
COMMIT TRANSACTION;  -- Release locks quickly
 
-- Separate transaction for bulk operations
BEGIN TRANSACTION
    INSERT INTO OrderItems(order_num, prod_id, quantity, item_price)
    SELECT @order_num, prod_id, quantity, price
    FROM TempOrderItems;
COMMIT TRANSACTION;

Best Practices

1. Keep Transactions Short

-- ✅ Good: Short, focused transaction
BEGIN TRANSACTION
    UPDATE Inventory SET quantity = quantity - 1 WHERE prod_id = 'P001';
    INSERT INTO Sales(prod_id, quantity, sale_date) VALUES('P001', 1, GETDATE());
COMMIT TRANSACTION;
 
-- ❌ Bad: Long-running transaction
BEGIN TRANSACTION
    -- Complex calculations
    -- File I/O operations  
    -- Network calls
    -- Multiple large table scans
COMMIT TRANSACTION;  -- Holds locks too long

2. Use Appropriate Isolation Levels

-- For read-heavy operations
SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED;
BEGIN TRANSACTION
    SELECT COUNT(*) FROM LargeTable;  -- Dirty read OK for counts
COMMIT TRANSACTION;
 
-- For critical updates
SET TRANSACTION ISOLATION LEVEL SERIALIZABLE;
BEGIN TRANSACTION
    -- Critical financial operations
COMMIT TRANSACTION;

3. Handle Deadlocks Gracefully

DECLARE @RetryCount INT = 0;
DECLARE @MaxRetries INT = 3;
 
WHILE @RetryCount < @MaxRetries
BEGIN
    BEGIN TRY
        BEGIN TRANSACTION
            -- Operations that might deadlock
            UPDATE Account SET balance = balance - 100 WHERE account_id = 1;
            UPDATE Account SET balance = balance + 100 WHERE account_id = 2;
        COMMIT TRANSACTION;
        
        BREAK;  -- Success, exit loop
        
    END TRY
    BEGIN CATCH
        IF @@TRANCOUNT > 0
            ROLLBACK TRANSACTION;
            
        IF ERROR_NUMBER() = 1205  -- Deadlock
        BEGIN
            SET @RetryCount = @RetryCount + 1;
            WAITFOR DELAY '00:00:01';  -- Wait before retry
            CONTINUE;
        END
        ELSE
        BEGIN
            THROW;  -- Re-raise non-deadlock errors
        END
    END CATCH
END

Common Transaction Patterns

Financial Transfer

-- Money transfer between accounts
BEGIN TRANSACTION
    DECLARE @source_balance MONEY;
    DECLARE @transfer_amount MONEY = 500.00;
    
    -- Check source account balance
    SELECT @source_balance = balance 
    FROM Accounts 
    WHERE account_id = 'ACC001';
    
    IF @source_balance < @transfer_amount
    BEGIN
        ROLLBACK TRANSACTION;
        RAISERROR('Insufficient funds', 16, 1);
        RETURN;
    END
    
    -- Perform transfer
    UPDATE Accounts 
    SET balance = balance - @transfer_amount 
    WHERE account_id = 'ACC001';
    
    UPDATE Accounts 
    SET balance = balance + @transfer_amount 
    WHERE account_id = 'ACC002';
    
    -- Log transaction
    INSERT INTO TransactionLog(from_account, to_account, amount, transaction_date)
    VALUES('ACC001', 'ACC002', @transfer_amount, GETDATE());
    
COMMIT TRANSACTION;

Inventory Management

-- Reserve inventory for order
BEGIN TRANSACTION
    DECLARE @available_qty INT;
    DECLARE @order_qty INT = 5;
    
    -- Check availability
    SELECT @available_qty = quantity_available
    FROM Inventory
    WHERE prod_id = 'P001';
    
    IF @available_qty < @order_qty
    BEGIN
        ROLLBACK TRANSACTION;
        RAISERROR('Insufficient inventory', 16, 1);
        RETURN;
    END
    
    -- Reserve inventory
    UPDATE Inventory
    SET quantity_available = quantity_available - @order_qty,
        quantity_reserved = quantity_reserved + @order_qty
    WHERE prod_id = 'P001';
    
    -- Create order
    INSERT INTO Orders(cust_id, order_date, status)
    VALUES('C001', GETDATE(), 'PENDING');
    
    INSERT INTO OrderItems(order_num, prod_id, quantity)
    VALUES(@@IDENTITY, 'P001', @order_qty);
    
COMMIT TRANSACTION;

Key Takeaways

  1. Transactions ensure data integrity by treating multiple operations as atomic units
  2. Use transactions for related operations that must succeed or fail together
  3. Keep transactions short to minimize lock duration and improve concurrency
  4. Use savepoints for complex transactions requiring partial rollback capability
  5. Always include error handling with appropriate rollback logic
  6. Consider performance implications of transaction size and isolation levels
  7. Test transaction logic thoroughly including error and rollback scenarios
  8. Understand your DBMS-specific syntax as implementations vary significantly
  9. Use batch processing for large datasets instead of single massive transactions
  10. Plan for deadlock handling in multi-user environments

Cursors

Understanding Cursors

Cursors enable row-by-row processing of SQL result sets, allowing applications to step through data one record at a time rather than processing entire result sets at once.

The Problem with Standard SQL

Standard SQL retrieval operations work with result sets - complete collections of rows returned by queries:

-- Returns entire result set at once
SELECT * FROM Customers WHERE cust_state = 'CA';
-- No way to get "first row", "next row", or "previous 10 rows"

What Cursors Provide:

  • Sequential access: Move forward/backward through rows
  • Positional control: Jump to specific row positions
  • Row-by-row processing: Handle one record at a time
  • Interactive browsing: Support user navigation through data

Key Characteristics:

FeatureDescriptionBenefit
Server-side storageQuery results stored on DBMS serverEfficient memory usage
Scrollable navigationMove in any direction through dataFlexible data access
Stateful processingMaintains current positionResumable operations
Controlled fetchingRetrieve specific rows on demandReduced network traffic

Cursor Capabilities

Common Cursor Features:

1. Access Control

-- Read-only cursor
DECLARE ReadOnlyCursor CURSOR FOR
SELECT * FROM Products;
 
-- Updatable cursor (where supported)
DECLARE UpdatableCursor CURSOR FOR
SELECT * FROM Products FOR UPDATE;

2. Directional Movement

  • Forward: Next row(s)
  • Backward: Previous row(s)
  • Absolute: Specific row number
  • Relative: Move by offset from current position
  • First/Last: Jump to beginning/end

3. Column Editability

-- Some columns editable, others read-only
DECLARE MixedCursor CURSOR FOR
SELECT prod_id,        -- Read-only (primary key)
       prod_name,      -- Editable
       prod_price,     -- Editable  
       created_date    -- Read-only (system field)
FROM Products;

4. Scope Control

  • Local: Available only to creating procedure
  • Global: Available to entire session
  • Connection: Available across connection

5. Data Consistency

-- Static cursor (snapshot of data)
DECLARE StaticCursor CURSOR STATIC FOR
SELECT * FROM Orders;
 
-- Dynamic cursor (live data)
DECLARE DynamicCursor CURSOR DYNAMIC FOR  
SELECT * FROM Orders;

Working with Cursors

Cursor Lifecycle:

  1. DECLARE: Define cursor and SELECT statement
  2. OPEN: Execute query and populate cursor
  3. FETCH: Retrieve individual rows as needed
  4. CLOSE: Release cursor resources
  5. DEALLOCATE: Free cursor memory (some DBMSs)
-- Complete cursor lifecycle
DECLARE MyCursor CURSOR FOR SELECT * FROM Customers;  -- 1. Declare
OPEN MyCursor;                                        -- 2. Open
FETCH NEXT FROM MyCursor INTO @variables;             -- 3. Fetch
CLOSE MyCursor;                                       -- 4. Close
DEALLOCATE MyCursor;                                  -- 5. Deallocate

Creating Cursors

Database-Specific Syntax

SQL Server/MySQL/MariaDB/DB2:

DECLARE CustCursor CURSOR
FOR
SELECT * FROM Customers
WHERE cust_email IS NULL;

Oracle/PostgreSQL:

DECLARE CURSOR CustCursor
IS
SELECT * FROM Customers
WHERE cust_email IS NULL;

Advanced Cursor Declaration:

-- SQL Server: Cursor with options
DECLARE AdvancedCursor CURSOR
    STATIC          -- Snapshot of data
    SCROLL          -- Allow backward movement
    READ_ONLY       -- Prevent updates
FOR
SELECT cust_id, cust_name, cust_email
FROM Customers
WHERE cust_state = 'CA'
ORDER BY cust_name;

Cursor Options:

OptionDescriptionUse Case
STATICSnapshot of data at open timeConsistent view during processing
DYNAMICLive data, sees changesReal-time data processing
FORWARD_ONLYCan only move forwardSimple sequential processing
SCROLLCan move in any directionInteractive browsing
READ_ONLYCannot update through cursorData analysis and reporting
FOR UPDATECan update current rowData modification scenarios

Opening and Using Cursors

Opening Cursors

-- Universal syntax
OPEN CURSOR CustCursor;
 
-- Or simply
OPEN CustCursor;

When OPEN executes:

  1. Query execution: SELECT statement runs
  2. Result population: Data stored in cursor
  3. Position initialization: Cursor positioned before first row
  4. Resource allocation: Memory allocated for result set

Fetching Data

Oracle Example:

DECLARE 
    TYPE CustCursor IS REF CURSOR RETURN Customers%ROWTYPE;
    cursor_var CustCursor;
    CustRecord Customers%ROWTYPE;
BEGIN
    OPEN cursor_var FOR
        SELECT * FROM Customers WHERE cust_email IS NULL;
    
    -- Fetch single row
    FETCH cursor_var INTO CustRecord;
    
    -- Process the record
    DBMS_OUTPUT.PUT_LINE('Customer: ' || CustRecord.cust_name);
    
    CLOSE cursor_var;
END;
/

Oracle Loop Processing:

DECLARE 
    TYPE CustCursor IS REF CURSOR RETURN Customers%ROWTYPE;
    cursor_var CustCursor;
    CustRecord Customers%ROWTYPE;
BEGIN
    OPEN cursor_var FOR
        SELECT * FROM Customers WHERE cust_email IS NULL;
    
    LOOP
        FETCH cursor_var INTO CustRecord;
        EXIT WHEN cursor_var%NOTFOUND;  -- Exit when no more rows
        
        -- Process each customer
        DBMS_OUTPUT.PUT_LINE('Processing: ' || CustRecord.cust_name);
        
        -- Add business logic here
        -- Update customer record, send email, etc.
        
    END LOOP;
    
    CLOSE cursor_var;
END;
/

SQL Server Example:

-- Declare variables for each column
DECLARE @cust_id CHAR(10),
        @cust_name CHAR(50),
        @cust_address CHAR(50),
        @cust_city CHAR(50),
        @cust_state CHAR(5),
        @cust_zip CHAR(10),
        @cust_country CHAR(50),
        @cust_contact CHAR(50),
        @cust_email CHAR(255);
 
-- Declare cursor
DECLARE CustCursor CURSOR FOR
SELECT * FROM Customers WHERE cust_email IS NULL;
 
-- Open cursor
OPEN CustCursor;
 
-- Fetch first row
FETCH NEXT FROM CustCursor
INTO @cust_id, @cust_name, @cust_address,
     @cust_city, @cust_state, @cust_zip,
     @cust_country, @cust_contact, @cust_email;
 
-- Process all rows
WHILE @@FETCH_STATUS = 0
BEGIN
    -- Process current customer
    PRINT 'Processing customer: ' + @cust_name;
    
    -- Add business logic here
    -- Update records, generate reports, etc.
    
    -- Fetch next row
    FETCH NEXT FROM CustCursor
    INTO @cust_id, @cust_name, @cust_address,
         @cust_city, @cust_state, @cust_zip,
         @cust_country, @cust_contact, @cust_email;
END;
 
-- Clean up
CLOSE CustCursor;
DEALLOCATE CustCursor;

Fetch Status Values:

| Status | SQL Server | Oracle | Meaning |-----|-----|----- | Success | @@FETCH_STATUS = 0 | cursor%FOUND | Row retrieved successfully | No data | @@FETCH_STATUS = -1 | cursor%NOTFOUND | No more rows available | Missing row | @@FETCH_STATUS = -2 | N/A | Row was deleted

Advanced Cursor Operations

Scrollable Cursors (SQL Server)

DECLARE ScrollCursor CURSOR SCROLL FOR
SELECT cust_id, cust_name FROM Customers
ORDER BY cust_name;
 
OPEN ScrollCursor;
 
-- Move to different positions
FETCH FIRST FROM ScrollCursor;      -- First row
FETCH LAST FROM ScrollCursor;       -- Last row  
FETCH ABSOLUTE 5 FROM ScrollCursor; -- 5th row
FETCH RELATIVE -2 FROM ScrollCursor; -- 2 rows back
FETCH NEXT FROM ScrollCursor;       -- Next row
FETCH PRIOR FROM ScrollCursor;      -- Previous row
 
CLOSE ScrollCursor;
DEALLOCATE ScrollCursor;

Updatable Cursors

-- SQL Server: Update through cursor
DECLARE UpdCursor CURSOR FOR
SELECT cust_id, cust_email 
FROM Customers
WHERE cust_email IS NULL
FOR UPDATE OF cust_email;
 
OPEN UpdCursor;
 
DECLARE @cust_id CHAR(10), @cust_email CHAR(255);
 
FETCH NEXT FROM UpdCursor INTO @cust_id, @cust_email;
 
WHILE @@FETCH_STATUS = 0
BEGIN
    -- Generate email address
    SET @cust_email = LOWER(@cust_id) + '@example.com';
    
    -- Update current row
    UPDATE Customers 
    SET cust_email = @cust_email
    WHERE CURRENT OF UpdCursor;
    
    FETCH NEXT FROM UpdCursor INTO @cust_id, @cust_email;
END;
 
CLOSE UpdCursor;
DEALLOCATE UpdCursor;

Cursor with Parameters (SQL Server)

-- Parameterized cursor
DECLARE StateCursor CURSOR FOR
SELECT cust_id, cust_name, cust_city
FROM Customers  
WHERE cust_state = @state_param
ORDER BY cust_city;
 
-- Use with different states
DECLARE @state_param CHAR(2) = 'CA';
OPEN StateCursor;
-- Process California customers
CLOSE StateCursor;
 
SET @state_param = 'NY';  
OPEN StateCursor;
-- Process New York customers  
CLOSE StateCursor;
 
DEALLOCATE StateCursor;

Practical Cursor Examples

Data Migration with Validation

-- Migrate customer data with validation
DECLARE MigrationCursor CURSOR FOR
SELECT cust_id, cust_name, cust_email, cust_phone
FROM LegacyCustomers
WHERE migration_status IS NULL;
 
OPEN MigrationCursor;
 
DECLARE @cust_id CHAR(10), @cust_name VARCHAR(50), 
        @cust_email VARCHAR(100), @cust_phone VARCHAR(20);
DECLARE @error_count INT = 0;
 
FETCH NEXT FROM MigrationCursor 
INTO @cust_id, @cust_name, @cust_email, @cust_phone;
 
WHILE @@FETCH_STATUS = 0
BEGIN
    BEGIN TRY
        -- Validate data
        IF @cust_email IS NOT NULL AND @cust_email NOT LIKE '%@%'
        BEGIN
            RAISERROR('Invalid email format', 16, 1);
        END;
        
        -- Insert into new system
        INSERT INTO Customers(cust_id, cust_name, cust_email, cust_phone)
        VALUES(@cust_id, @cust_name, @cust_email, @cust_phone);
        
        -- Mark as migrated
        UPDATE LegacyCustomers 
        SET migration_status = 'SUCCESS',
            migration_date = GETDATE()
        WHERE CURRENT OF MigrationCursor;
        
    END TRY
    BEGIN CATCH
        -- Log error
        SET @error_count = @error_count + 1;
        
        UPDATE LegacyCustomers 
        SET migration_status = 'ERROR',
            migration_error = ERROR_MESSAGE()
        WHERE CURRENT OF MigrationCursor;
    END CATCH;
    
    FETCH NEXT FROM MigrationCursor 
    INTO @cust_id, @cust_name, @cust_email, @cust_phone;
END;
 
CLOSE MigrationCursor;
DEALLOCATE MigrationCursor;
 
PRINT 'Migration completed with ' + CAST(@error_count AS VARCHAR) + ' errors';

Batch Processing with Progress Tracking

-- Process large dataset in batches
DECLARE BatchCursor CURSOR FOR
SELECT order_num, cust_id, order_date
FROM Orders
WHERE status = 'PENDING'
ORDER BY order_date;
 
OPEN BatchCursor;
 
DECLARE @order_num INT, @cust_id CHAR(10), @order_date DATE;
DECLARE @processed_count INT = 0;
DECLARE @batch_size INT = 100;
 
FETCH NEXT FROM BatchCursor INTO @order_num, @cust_id, @order_date;
 
WHILE @@FETCH_STATUS = 0
BEGIN
    -- Process order
    EXEC ProcessOrder @order_num;
    
    SET @processed_count = @processed_count + 1;
    
    -- Progress reporting
    IF @processed_count % @batch_size = 0
    BEGIN
        PRINT 'Processed ' + CAST(@processed_count AS VARCHAR) + ' orders';
        
        -- Brief pause to allow other operations
        WAITFOR DELAY '00:00:01';
    END;
    
    FETCH NEXT FROM BatchCursor INTO @order_num, @cust_id, @order_date;
END;
 
CLOSE BatchCursor;
DEALLOCATE BatchCursor;
 
PRINT 'Total orders processed: ' + CAST(@processed_count AS VARCHAR);

Report Generation with Grouping

-- Generate hierarchical report
DECLARE ReportCursor CURSOR FOR
SELECT v.vend_name, p.prod_name, p.prod_price
FROM Vendors v
INNER JOIN Products p ON v.vend_id = p.vend_id
ORDER BY v.vend_name, p.prod_name;
 
OPEN ReportCursor;
 
DECLARE @vend_name VARCHAR(50), @prod_name VARCHAR(50), @prod_price MONEY;
DECLARE @current_vendor VARCHAR(50) = '';
DECLARE @vendor_total MONEY = 0;
 
FETCH NEXT FROM ReportCursor INTO @vend_name, @prod_name, @prod_price;
 
WHILE @@FETCH_STATUS = 0
BEGIN
    -- Check for vendor change
    IF @vend_name != @current_vendor
    BEGIN
        -- Print vendor total (if not first vendor)
        IF @current_vendor != ''
        BEGIN
            PRINT '  Vendor Total: $' + CAST(@vendor_total AS VARCHAR);
            PRINT '';
        END;
        
        -- Start new vendor section
        PRINT 'VENDOR: ' + @vend_name;
        PRINT '----------------------------------------';
        SET @current_vendor = @vend_name;
        SET @vendor_total = 0;
    END;
    
    -- Print product line
    PRINT '  ' + @prod_name + ' - $' + CAST(@prod_price AS VARCHAR);
    SET @vendor_total = @vendor_total + @prod_price;
    
    FETCH NEXT FROM ReportCursor INTO @vend_name, @prod_name, @prod_price;
END;
 
-- Print final vendor total
IF @current_vendor != ''
BEGIN
    PRINT '  Vendor Total: $' + CAST(@vendor_total AS VARCHAR);
END;
 
CLOSE ReportCursor;
DEALLOCATE ReportCursor;

Closing Cursors

Basic Closing

-- Most DBMSs
CLOSE CustCursor;
 
-- SQL Server also requires deallocation
CLOSE CustCursor;
DEALLOCATE CURSOR CustCursor;

Proper Cleanup Pattern

-- SQL Server: Safe cleanup
IF CURSOR_STATUS('global', 'CustCursor') >= -1
BEGIN
    IF CURSOR_STATUS('global', 'CustCursor') > -1
    BEGIN
        CLOSE CustCursor;
    END;
    DEALLOCATE CustCursor;
END;

Resource Management:

OperationPurposeRequired
CLOSERelease result set, maintain definitionAlways
DEALLOCATEFree cursor memory completelySQL Server, some others
ReopenCan reopen closed cursorAfter CLOSE, before DEALLOCATE

Performance Considerations

When to Use Cursors

-- ✅ Good: Row-by-row processing required
DECLARE ProcessCursor CURSOR FOR
SELECT customer_id, complex_calculation_needed
FROM Customers;
-- Each row needs individual processing logic
 
-- ❌ Bad: Set-based operation possible  
DECLARE UpdateCursor CURSOR FOR
SELECT customer_id FROM Customers;
-- Could use: UPDATE Customers SET status = 'ACTIVE'

Cursor vs Set-Based Operations

-- ❌ Slow: Cursor-based update
DECLARE SlowCursor CURSOR FOR
SELECT cust_id FROM Customers WHERE cust_state = 'CA';
 
OPEN SlowCursor;
DECLARE @cust_id CHAR(10);
FETCH NEXT FROM SlowCursor INTO @cust_id;
 
WHILE @@FETCH_STATUS = 0
BEGIN
    UPDATE Customers SET discount = 0.10 WHERE cust_id = @cust_id;
    FETCH NEXT FROM SlowCursor INTO @cust_id;
END;
 
CLOSE SlowCursor;
DEALLOCATE SlowCursor;
 
-- ✅ Fast: Set-based update
UPDATE Customers 
SET discount = 0.10 
WHERE cust_state = 'CA';

Optimization Tips:

  1. Use FAST_FORWARD cursors for sequential processing
  2. Limit result sets with WHERE clauses
  3. Avoid unnecessary columns in SELECT
  4. Process in batches for large datasets
  5. Consider alternatives like CTEs or window functions

Best Practices

1. Always Clean Up Resources

-- ✅ Good: Proper cleanup
DECLARE @cursor_name VARCHAR(50) = 'MyCursor';
 
BEGIN TRY
    DECLARE MyCursor CURSOR FOR SELECT * FROM Customers;
    OPEN MyCursor;
    
    -- Cursor operations here
    
END TRY
BEGIN CATCH
    -- Handle errors
    PRINT ERROR_MESSAGE();
END CATCH;
 
-- Cleanup in all cases
IF CURSOR_STATUS('local', @cursor_name) >= 0
BEGIN
    CLOSE MyCursor;
    DEALLOCATE MyCursor;
END;

2. Use Appropriate Cursor Types

-- For sequential processing
DECLARE FastCursor CURSOR FAST_FORWARD FOR
SELECT * FROM LargeTable;
 
-- For random access
DECLARE ScrollCursor CURSOR SCROLL STATIC FOR  
SELECT * FROM SmallTable;
 
-- For updates
DECLARE UpdateCursor CURSOR FOR
SELECT * FROM EditableTable
FOR UPDATE OF column_name;

3. Minimize Cursor Scope

-- ✅ Good: Local cursor
DECLARE LocalCursor CURSOR LOCAL FOR
SELECT * FROM Customers;
 
-- ❌ Avoid: Global cursor (unless necessary)
DECLARE GlobalCursor CURSOR GLOBAL FOR
SELECT * FROM Customers;

Common Cursor Patterns

Error Handling with Cursors

DECLARE ErrorCursor CURSOR FOR
SELECT risky_data FROM ProblematicTable;
 
OPEN ErrorCursor;
 
DECLARE @data VARCHAR(100);
DECLARE @error_count INT = 0;
 
FETCH NEXT FROM ErrorCursor INTO @data;
 
WHILE @@FETCH_STATUS = 0
BEGIN
    BEGIN TRY
        -- Risky operation
        EXEC ProcessRiskyData @data;
    END TRY
    BEGIN CATCH
        SET @error_count = @error_count + 1;
        PRINT 'Error processing: ' + @data + ' - ' + ERROR_MESSAGE();
    END CATCH;
    
    FETCH NEXT FROM ErrorCursor INTO @data;
END;
 
CLOSE ErrorCursor;
DEALLOCATE ErrorCursor;
 
PRINT 'Processing completed with ' + CAST(@error_count AS VARCHAR) + ' errors';

Advanced Features

Understanding Constraints

Constraints are rules that govern how database data is inserted or manipulated. They enforce data integrity at the database level, ensuring consistent and valid data across all applications.

Why Use Database-Level Constraints?

ProblemClient-Side ValidationDatabase Constraints
ConsistencyEach client must implement rulesRules enforced universally
ReliabilitySome clients may skip validationDBMS always enforces rules
PerformanceNetwork round-trips for validationEfficient server-side checking
MaintenanceUpdate multiple applicationsChange rules in one place

Benefits of Constraints:

  • Data integrity: Prevent invalid data entry
  • Referential integrity: Maintain relationships between tables
  • Business rules: Enforce domain-specific requirements
  • Error prevention: Catch problems before they corrupt data

Primary Keys

Primary keys uniquely identify each row in a table and serve as the foundation for referential integrity.

Primary Key Requirements:

  1. Uniqueness: No two rows can have the same primary key value
  2. Non-null: Every row must have a primary key value
  3. Immutable: Primary key values cannot be modified
  4. Non-reusable: Deleted primary keys cannot be reassigned

Creating Primary Keys:

-- Method 1: Inline definition
CREATE TABLE Vendors (
    vend_id        CHAR(10)    NOT NULL PRIMARY KEY,
    vend_name      CHAR(50)    NOT NULL,
    vend_address   CHAR(50)    NULL,
    vend_city      CHAR(50)    NULL,
    vend_state     CHAR(5)     NULL,
    vend_zip       CHAR(10)    NULL,
    vend_country   CHAR(50)    NULL
);
 
-- Method 2: Constraint syntax
CREATE TABLE Vendors (
    vend_id        CHAR(10)    NOT NULL,
    vend_name      CHAR(50)    NOT NULL,
    -- other columns...
    CONSTRAINT pk_vendors PRIMARY KEY (vend_id)
);
 
-- Method 3: ALTER TABLE (most DBMSs)
ALTER TABLE Vendors
ADD CONSTRAINT pk_vendors PRIMARY KEY (vend_id);

Composite Primary Keys:

-- Multiple columns as primary key
CREATE TABLE OrderItems (
    order_num      INTEGER     NOT NULL,
    order_item     INTEGER     NOT NULL,
    prod_id        CHAR(10)    NOT NULL,
    quantity       INTEGER     NOT NULL,
    item_price     MONEY       NOT NULL,
    CONSTRAINT pk_orderitems PRIMARY KEY (order_num, order_item)
);

Foreign Keys

Foreign keys enforce referential integrity by ensuring values in one table correspond to valid values in another table's primary key.

Foreign Key Concepts:

-- Parent table (referenced)
CREATE TABLE Customers (
    cust_id        CHAR(10)    PRIMARY KEY,
    cust_name      VARCHAR(50) NOT NULL
);
 
-- Child table (referencing)
CREATE TABLE Orders (
    order_num      INTEGER     PRIMARY KEY,
    order_date     DATETIME    NOT NULL,
    cust_id        CHAR(10)    NOT NULL,
    CONSTRAINT fk_orders_customers 
        FOREIGN KEY (cust_id) REFERENCES Customers(cust_id)
);

Foreign Key Creation Methods:

-- Method 1: Inline REFERENCES
CREATE TABLE Orders (
    order_num    INTEGER    NOT NULL PRIMARY KEY,
    order_date   DATETIME   NOT NULL,
    cust_id      CHAR(10)   NOT NULL REFERENCES Customers(cust_id)
);
 
-- Method 2: CONSTRAINT syntax
ALTER TABLE Orders
ADD CONSTRAINT fk_orders_customers
FOREIGN KEY (cust_id) REFERENCES Customers(cust_id);
 
-- Method 3: Named constraint in CREATE TABLE
CREATE TABLE Orders (
    order_num    INTEGER    NOT NULL PRIMARY KEY,
    order_date   DATETIME   NOT NULL,
    cust_id      CHAR(10)   NOT NULL,
    CONSTRAINT fk_orders_customers 
        FOREIGN KEY (cust_id) REFERENCES Customers(cust_id)
);

Cascading Actions:

-- Cascade delete: Delete child records when parent is deleted
ALTER TABLE Orders
ADD CONSTRAINT fk_orders_customers
FOREIGN KEY (cust_id) REFERENCES Customers(cust_id)
ON DELETE CASCADE;
 
-- Cascade update: Update child records when parent key changes
ALTER TABLE Orders
ADD CONSTRAINT fk_orders_customers
FOREIGN KEY (cust_id) REFERENCES Customers(cust_id)
ON UPDATE CASCADE;
 
-- Set null: Set foreign key to NULL when parent is deleted
ALTER TABLE Orders
ADD CONSTRAINT fk_orders_customers
FOREIGN KEY (cust_id) REFERENCES Customers(cust_id)
ON DELETE SET NULL;
 
-- Restrict: Prevent deletion of parent if children exist (default)
ALTER TABLE Orders
ADD CONSTRAINT fk_orders_customers
FOREIGN KEY (cust_id) REFERENCES Customers(cust_id)
ON DELETE RESTRICT;

Benefits of Foreign Keys:

  • Referential integrity: Prevent orphaned records
  • Data consistency: Maintain valid relationships
  • Accidental deletion prevention: Protect related data
  • Documentation: Self-documenting table relationships

Unique Constraints

Unique constraints ensure column values are unique but allow more flexibility than primary keys.

Primary Key vs Unique Constraint:

FeaturePrimary KeyUnique Constraint
Quantity per tableOne onlyMultiple allowed
NULL valuesNot allowedAllowed
ModificationNot allowedAllowed
Reuse after deletionNot allowedAllowed
Foreign key referenceYesNo

Creating Unique Constraints:

-- Employee table with multiple unique constraints
CREATE TABLE Employees (
    emp_id         INTEGER     PRIMARY KEY,
    emp_ssn        CHAR(11)    UNIQUE,           -- Inline unique
    emp_email      VARCHAR(100),
    emp_phone      VARCHAR(20),
    emp_name       VARCHAR(100) NOT NULL,
    
    -- Named unique constraints
    CONSTRAINT uk_emp_email UNIQUE (emp_email),
    CONSTRAINT uk_emp_phone UNIQUE (emp_phone)
);
 
-- ALTER TABLE method
ALTER TABLE Employees
ADD CONSTRAINT uk_emp_badge UNIQUE (emp_badge_number);

Composite Unique Constraints:

-- Ensure combination of columns is unique
CREATE TABLE CourseEnrollments (
    enrollment_id  INTEGER     PRIMARY KEY,
    student_id     INTEGER     NOT NULL,
    course_id      INTEGER     NOT NULL,
    semester       VARCHAR(20) NOT NULL,
    
    -- Student can only enroll in same course once per semester
    CONSTRAINT uk_student_course_semester 
        UNIQUE (student_id, course_id, semester)
);

Check Constraints

Check constraints validate data against specific criteria before allowing insertion or updates.

Common Check Constraint Uses:

-- Range validation
CREATE TABLE Products (
    prod_id        CHAR(10)    PRIMARY KEY,
    prod_name      VARCHAR(50) NOT NULL,
    prod_price     MONEY       CHECK (prod_price > 0),
    prod_weight    DECIMAL(8,2) CHECK (prod_weight BETWEEN 0 AND 1000),
    prod_category  VARCHAR(20) CHECK (prod_category IN ('Electronics', 'Clothing', 'Books'))
);
 
-- Date validation
CREATE TABLE Orders (
    order_num      INTEGER     PRIMARY KEY,
    order_date     DATE        NOT NULL,
    ship_date      DATE,
    
    -- Ship date must be on or after order date
    CONSTRAINT chk_ship_after_order 
        CHECK (ship_date >= order_date)
);
 
-- Pattern validation
CREATE TABLE Customers (
    cust_id        CHAR(10)    PRIMARY KEY,
    cust_email     VARCHAR(100) CHECK (cust_email LIKE '%@%'),
    cust_phone     VARCHAR(15)  CHECK (cust_phone LIKE '[0-9][0-9][0-9]-[0-9][0-9][0-9]-[0-9][0-9][0-9][0-9]'),
    cust_gender    CHAR(1)      CHECK (cust_gender IN ('M', 'F', 'O'))
);

Complex Check Constraints:

-- Business rule validation
CREATE TABLE OrderItems (
    order_num      INTEGER     NOT NULL,
    order_item     INTEGER     NOT NULL,
    prod_id        CHAR(10)    NOT NULL,
    quantity       INTEGER     NOT NULL CHECK (quantity > 0),
    item_price     MONEY       NOT NULL CHECK (item_price >= 0),
    discount_pct   DECIMAL(5,2) DEFAULT 0,
    
    -- Discount cannot exceed 50%
    CONSTRAINT chk_discount_limit CHECK (discount_pct BETWEEN 0 AND 50),
    
    -- Total discount amount cannot exceed item price
    CONSTRAINT chk_discount_amount 
        CHECK ((item_price * quantity * discount_pct / 100) <= (item_price * quantity))
);
 
-- Cross-column validation
CREATE TABLE Employees (
    emp_id         INTEGER     PRIMARY KEY,
    hire_date      DATE        NOT NULL,
    termination_date DATE,
    birth_date     DATE        NOT NULL,
    
    -- Termination date must be after hire date
    CONSTRAINT chk_term_after_hire 
        CHECK (termination_date IS NULL OR termination_date > hire_date),
    
    -- Employee must be at least 16 years old when hired
    CONSTRAINT chk_minimum_age 
        CHECK (DATEDIFF(year, birth_date, hire_date) >= 16)
);

Adding Check Constraints:

-- Add to existing table
ALTER TABLE Products
ADD CONSTRAINT chk_prod_price_positive 
CHECK (prod_price > 0);
 
-- Multiple conditions
ALTER TABLE Orders
ADD CONSTRAINT chk_order_dates
CHECK (order_date <= GETDATE() AND 
       (ship_date IS NULL OR ship_date >= order_date));

Understanding Indexes

Indexes improve query performance by creating sorted references to table data, similar to book indexes.

How Indexes Work:

-- Without index: Full table scan
SELECT * FROM Customers WHERE cust_state = 'CA';
-- DBMS must examine every row
 
-- With index on cust_state: Index seek
CREATE INDEX idx_cust_state ON Customers(cust_state);
SELECT * FROM Customers WHERE cust_state = 'CA';
-- DBMS uses index to find matching rows quickly

Index Benefits and Costs:

| Benefits | Costs |-----|-----|----- | Faster SELECT queries | Slower INSERT/UPDATE/DELETE | Faster ORDER BY operations | Additional storage space | Faster JOIN operations | Index maintenance overhead | Faster WHERE clause filtering | Memory usage for index cache

Creating Indexes:

-- Simple index on single column
CREATE INDEX idx_prod_name ON Products(prod_name);
 
-- Composite index on multiple columns
CREATE INDEX idx_cust_name_state ON Customers(cust_name, cust_state);
 
-- Unique index (enforces uniqueness)
CREATE UNIQUE INDEX idx_emp_ssn ON Employees(emp_ssn);
 
-- Descending index
CREATE INDEX idx_order_date_desc ON Orders(order_date DESC);

Index Types and Options:

-- Clustered index (SQL Server) - physically orders table data
CREATE CLUSTERED INDEX idx_cust_id ON Customers(cust_id);
 
-- Non-clustered index (default) - separate structure pointing to data
CREATE NONCLUSTERED INDEX idx_cust_email ON Customers(cust_email);
 
-- Partial index (PostgreSQL) - index only rows meeting condition
CREATE INDEX idx_active_customers ON Customers(cust_name) 
WHERE cust_status = 'ACTIVE';
 
-- Covering index - includes additional columns for query optimization
CREATE INDEX idx_order_covering ON Orders(cust_id) 
INCLUDE (order_date, order_total);

Index Strategy Guidelines:

Good Candidates for Indexing:

  • Primary keys (usually indexed automatically)
  • Foreign keys (for JOIN performance)
  • Frequently searched columns (WHERE clauses)
  • Columns used in ORDER BY
  • Columns with high selectivity (many unique values)

Poor Candidates for Indexing:

  • Small tables (full scan is faster)
  • Columns with low selectivity (few unique values)
  • Frequently updated columns (high maintenance cost)
  • Very wide columns (large storage overhead)
-- ✅ Good: High selectivity, frequently searched
CREATE INDEX idx_customer_email ON Customers(cust_email);
CREATE INDEX idx_order_date ON Orders(order_date);
 
-- ❌ Poor: Low selectivity
CREATE INDEX idx_customer_gender ON Customers(cust_gender); -- Only M/F/O values
 
-- ✅ Good: Composite index for common query pattern
CREATE INDEX idx_product_category_price ON Products(prod_category, prod_price);
-- Supports: WHERE prod_category = 'Electronics' ORDER BY prod_price

Understanding Triggers

Triggers are special stored procedures that execute automatically in response to specific database events.

Trigger Types:

TimingEventDescription
BEFOREINSERTExecute before row insertion
AFTERINSERTExecute after row insertion
BEFOREUPDATEExecute before row modification
AFTERUPDATEExecute after row modification
BEFOREDELETEExecute before row deletion
AFTERDELETEExecute after row deletion
INSTEAD OFAnyReplace the triggering operation (views)

Common Trigger Uses:

1. Data Validation and Transformation:

-- SQL Server: Convert state to uppercase
CREATE TRIGGER tr_customer_state_upper
ON Customers
FOR INSERT, UPDATE
AS
BEGIN
    UPDATE Customers
    SET cust_state = UPPER(cust_state)
    WHERE cust_id IN (SELECT cust_id FROM inserted);
END;
 
-- Oracle/PostgreSQL version
CREATE TRIGGER tr_customer_state_upper
    BEFORE INSERT OR UPDATE ON Customers
    FOR EACH ROW
BEGIN
    :NEW.cust_state := UPPER(:NEW.cust_state);
END;

2. Audit Trail Creation:

-- Create audit table
CREATE TABLE CustomerAudit (
    audit_id       INTEGER IDENTITY PRIMARY KEY,
    cust_id        CHAR(10),
    action_type    VARCHAR(10),
    old_values     VARCHAR(1000),
    new_values     VARCHAR(1000),
    changed_by     VARCHAR(50),
    changed_date   DATETIME
);
 
-- SQL Server audit trigger
CREATE TRIGGER tr_customer_audit
ON Customers
FOR INSERT, UPDATE, DELETE
AS
BEGIN
    -- Log insertions
    INSERT INTO CustomerAudit(cust_id, action_type, new_values, changed_by, changed_date)
    SELECT cust_id, 'INSERT', 
           'Name: ' + cust_name + ', Email: ' + ISNULL(cust_email, 'NULL'),
           SYSTEM_USER, GETDATE()
    FROM inserted;
    
    -- Log updates
    INSERT INTO CustomerAudit(cust_id, action_type, old_values, new_values, changed_by, changed_date)
    SELECT i.cust_id, 'UPDATE',
           'Name: ' + d.cust_name + ', Email: ' + ISNULL(d.cust_email, 'NULL'),
           'Name: ' + i.cust_name + ', Email: ' + ISNULL(i.cust_email, 'NULL'),
           SYSTEM_USER, GETDATE()
    FROM inserted i
    INNER JOIN deleted d ON i.cust_id = d.cust_id;
    
    -- Log deletions
    INSERT INTO CustomerAudit(cust_id, action_type, old_values, changed_by, changed_date)
    SELECT cust_id, 'DELETE',
           'Name: ' + cust_name + ', Email: ' + ISNULL(cust_email, 'NULL'),
           SYSTEM_USER, GETDATE()
    FROM deleted;
END;

3. Business Rule Enforcement:

-- Prevent orders that exceed customer credit limit
CREATE TRIGGER tr_check_credit_limit
ON Orders
FOR INSERT, UPDATE
AS
BEGIN
    DECLARE @exceeded_customers TABLE (cust_id CHAR(10));
    
    -- Find customers who would exceed credit limit
    INSERT INTO @exceeded_customers
    SELECT i.cust_id
    FROM inserted i
    INNER JOIN Customers c ON i.cust_id = c.cust_id
    WHERE i.order_total > c.credit_limit;
    
    -- Reject transaction if any customer exceeds limit
    IF EXISTS (SELECT 1 FROM @exceeded_customers)
    BEGIN
        ROLLBACK TRANSACTION;
        RAISERROR('Order exceeds customer credit limit', 16, 1);
    END;
END;

4. Automatic Timestamp Updates:

-- Add timestamp columns to table
ALTER TABLE Products
ADD created_date DATETIME DEFAULT GETDATE(),
    modified_date DATETIME DEFAULT GETDATE();
 
-- Trigger to update modification timestamp
CREATE TRIGGER tr_product_timestamp
ON Products
FOR UPDATE
AS
BEGIN
    UPDATE Products
    SET modified_date = GETDATE()
    WHERE prod_id IN (SELECT prod_id FROM inserted);
END;

5. Inventory Management:

-- Update inventory when order items are added/modified
CREATE TRIGGER tr_update_inventory
ON OrderItems
FOR INSERT, UPDATE, DELETE
AS
BEGIN
    -- Handle insertions and updates
    UPDATE Products
    SET quantity_in_stock = quantity_in_stock - 
        (ISNULL(i.quantity, 0) - ISNULL(d.quantity, 0))
    FROM Products p
    INNER JOIN inserted i ON p.prod_id = i.prod_id
    LEFT JOIN deleted d ON i.order_num = d.order_num AND i.order_item = d.order_item;
    
    -- Handle deletions (restore inventory)
    UPDATE Products
    SET quantity_in_stock = quantity_in_stock + d.quantity
    FROM Products p
    INNER JOIN deleted d ON p.prod_id = d.prod_id
    WHERE NOT EXISTS (SELECT 1 FROM inserted WHERE order_num = d.order_num AND order_item = d.order_item);
END;

Trigger Best Practices:

-- ✅ Good: Keep triggers simple and fast
CREATE TRIGGER tr_simple_audit
ON Customers
FOR UPDATE
AS
BEGIN
    INSERT INTO AuditLog(table_name, action, user_name, timestamp)
    VALUES('Customers', 'UPDATE', SYSTEM_USER, GETDATE());
END;
 
-- ❌ Bad: Complex logic in trigger
CREATE TRIGGER tr_complex_business_logic
ON Orders
FOR INSERT
AS
BEGIN
    -- Multiple complex calculations
    -- External API calls
    -- Complex business rules
    -- This should be in stored procedures or application code
END;

Database Security

Database security protects data through user authentication, authorization, and access control mechanisms.

Security Layers:

  1. Authentication: Verify user identity
  2. Authorization: Control what users can do
  3. Access Control: Limit data visibility
  4. Auditing: Track user activities

User Management:

-- Create database users
CREATE USER 'app_user'@'localhost' IDENTIFIED BY 'secure_password';
CREATE USER 'report_user'@'%' IDENTIFIED BY 'report_password';
CREATE USER 'admin_user'@'localhost' IDENTIFIED BY 'admin_password';
 
-- Create roles for easier management
CREATE ROLE 'app_role';
CREATE ROLE 'reporting_role';
CREATE ROLE 'admin_role';

GRANT Statement - Giving Permissions:

-- Grant table-level permissions
GRANT SELECT ON Customers TO 'report_user'@'%';
GRANT INSERT, UPDATE ON Orders TO 'app_user'@'localhost';
GRANT ALL PRIVILEGES ON Products TO 'admin_user'@'localhost';
 
-- Grant column-level permissions
GRANT SELECT (cust_name, cust_email) ON Customers TO 'marketing_user';
GRANT UPDATE (prod_price) ON Products TO 'pricing_user';
 
-- Grant database-level permissions
GRANT CREATE, DROP ON sales_db.* TO 'admin_user'@'localhost';
 
-- Grant role permissions
GRANT 'app_role' TO 'app_user'@'localhost';
GRANT 'reporting_role' TO 'report_user'@'%';

REVOKE Statement - Removing Permissions:

-- Revoke specific permissions
REVOKE INSERT ON Orders FROM 'app_user'@'localhost';
REVOKE ALL PRIVILEGES ON Products FROM 'temp_user'@'%';
 
-- Revoke role membership
REVOKE 'admin_role' FROM 'former_admin'@'localhost';

Advanced Security Features:

-- Row-level security (SQL Server)
CREATE FUNCTION dbo.fn_security_predicate(@user_id INT)
RETURNS TABLE
WITH SCHEMABINDING
AS
RETURN SELECT 1 AS result 
WHERE @user_id = USER_ID() OR IS_MEMBER('db_owner') = 1;
 
-- Apply security policy
CREATE SECURITY POLICY CustomerSecurityPolicy
ADD FILTER PREDICATE dbo.fn_security_predicate(cust_id) ON Customers,
ADD BLOCK PREDICATE dbo.fn_security_predicate(cust_id) ON Customers;
 
-- View-based security
CREATE VIEW CustomerSafeView AS
SELECT cust_id, cust_name, cust_city, cust_state
FROM Customers
WHERE cust_region = USER_REGION();
 
-- Grant access to view, not table
GRANT SELECT ON CustomerSafeView TO 'regional_user';
REVOKE ALL ON Customers FROM 'regional_user';

Security Best Practices:

1. Principle of Least Privilege:

-- ✅ Good: Minimal necessary permissions
GRANT SELECT ON Orders TO 'report_user';
GRANT INSERT, UPDATE ON OrderItems TO 'order_entry_user';
 
-- ❌ Bad: Excessive permissions
GRANT ALL PRIVILEGES ON *.* TO 'app_user'; -- Too broad

2. Use Roles for Permission Management:

-- Create functional roles
CREATE ROLE order_processor;
CREATE ROLE customer_service;
CREATE ROLE financial_analyst;
 
-- Assign permissions to roles
GRANT SELECT, INSERT, UPDATE ON Orders TO order_processor;
GRANT SELECT ON Customers TO customer_service;
GRANT SELECT ON financial_tables TO financial_analyst;
 
-- Assign roles to users
GRANT order_processor TO 'john_doe';
GRANT customer_service TO 'jane_smith';

3. Secure Sensitive Data:

-- Create separate schema for sensitive data
CREATE SCHEMA sensitive_data;
 
-- Move sensitive tables
ALTER SCHEMA sensitive_data TRANSFER dbo.EmployeeSalaries;
ALTER SCHEMA sensitive_data TRANSFER dbo.CustomerCreditCards;
 
-- Restrict access to schema
REVOKE ALL ON SCHEMA::sensitive_data FROM public;
GRANT SELECT ON SCHEMA::sensitive_data TO 'hr_manager';

Putting It All Together

Complete Table with All Constraint Types:

CREATE TABLE ComprehensiveOrders (
    -- Primary key
    order_id       INTEGER IDENTITY(1,1) PRIMARY KEY,
    
    -- Foreign keys
    cust_id        CHAR(10) NOT NULL,
    emp_id         INTEGER,
    
    -- Unique constraints
    order_number   VARCHAR(20) NOT NULL UNIQUE,
    
    -- Check constraints
    order_date     DATE NOT NULL CHECK (order_date <= GETDATE()),
    ship_date      DATE CHECK (ship_date >= order_date),
    order_total    MONEY CHECK (order_total >= 0),
    order_status   VARCHAR(20) DEFAULT 'PENDING' 
                   CHECK (order_status IN ('PENDING', 'PROCESSING', 'SHIPPED', 'DELIVERED', 'CANCELLED')),
    
    -- Timestamps
    created_date   DATETIME DEFAULT GETDATE(),
    modified_date  DATETIME DEFAULT GETDATE(),
    
    -- Foreign key constraints
    CONSTRAINT fk_orders_customers 
        FOREIGN KEY (cust_id) REFERENCES Customers(cust_id),
    CONSTRAINT fk_orders_employees 
        FOREIGN KEY (emp_id) REFERENCES Employees(emp_id),
    
    -- Additional check constraints
    CONSTRAINT chk_order_dates 
        CHECK (ship_date IS NULL OR ship_date >= order_date),
    CONSTRAINT chk_order_total_reasonable 
        CHECK (order_total BETWEEN 0 AND 1000000)
);
 
-- Create indexes for performance
CREATE INDEX idx_orders_customer ON ComprehensiveOrders(cust_id);
CREATE INDEX idx_orders_date ON ComprehensiveOrders(order_date);
CREATE INDEX idx_orders_status ON ComprehensiveOrders(order_status);
 
-- Create trigger for automatic timestamp updates
CREATE TRIGGER tr_orders_timestamp
ON ComprehensiveOrders
FOR UPDATE
AS
BEGIN
    UPDATE ComprehensiveOrders
    SET modified_date = GETDATE()
    WHERE order_id IN (SELECT order_id FROM inserted);
END;

Statement Syntax

Reading Syntax Conventions

Symbol Meanings:

  • | indicates alternatives (choose one): NULL|NOT NULL
  • [ ] indicates optional elements: [WHERE ...]
  • ... indicates repeatable elements: column, column, ...
  • UPPERCASE indicates SQL keywords
  • lowercase indicates user-defined names

ALTER TABLE

Purpose: Modify the structure of an existing table

ALTER TABLE tablename
(
    ADD|DROP column datatype [NULL|NOT NULL] [CONSTRAINTS],
    ADD|DROP column datatype [NULL|NOT NULL] [CONSTRAINTS],
    ...
);

Examples:

-- Add a new column
ALTER TABLE Customers
ADD cust_phone VARCHAR(15) NULL;
 
-- Add multiple columns
ALTER TABLE Products
(
    ADD prod_weight DECIMAL(8,2) NULL,
    ADD prod_color VARCHAR(20) NOT NULL DEFAULT 'Unknown'
);
 
-- Drop a column
ALTER TABLE Customers
DROP COLUMN cust_fax;
 
-- Add constraint
ALTER TABLE Orders
ADD CONSTRAINT fk_orders_customers 
FOREIGN KEY (cust_id) REFERENCES Customers(cust_id);
 
-- Modify column datatype
ALTER TABLE Products
ALTER COLUMN prod_price DECIMAL(10,2) NOT NULL;

Related: See Lesson 17 - Creating and Manipulating Tables


COMMIT

Purpose: Save transaction changes to the database

COMMIT [TRANSACTION];

Examples:

-- Basic commit
BEGIN TRANSACTION;
    INSERT INTO Customers VALUES(...);
    UPDATE Orders SET status = 'SHIPPED';
COMMIT;
 
-- Named transaction commit
BEGIN TRANSACTION OrderProcessing;
    -- Multiple operations
COMMIT TRANSACTION OrderProcessing;
 
-- Implicit commit (auto-commit mode)
INSERT INTO Products VALUES(...);  -- Automatically committed

Related: See Lesson 20 - Managing Transaction Processing


CREATE INDEX

Purpose: Create an index to improve query performance

CREATE INDEX indexname
ON tablename (column, ...);

Examples:

-- Simple index on single column
CREATE INDEX idx_cust_name 
ON Customers (cust_name);
 
-- Composite index on multiple columns
CREATE INDEX idx_order_date_customer 
ON Orders (order_date, cust_id);
 
-- Unique index
CREATE UNIQUE INDEX idx_prod_code 
ON Products (prod_code);
 
-- Descending index
CREATE INDEX idx_order_date_desc 
ON Orders (order_date DESC);
 
-- Partial index (PostgreSQL)
CREATE INDEX idx_active_customers 
ON Customers (cust_name) 
WHERE cust_status = 'ACTIVE';

Related: See Lesson 22 - Understanding Advanced SQL Features


CREATE PROCEDURE

Purpose: Create a stored procedure for reusable SQL code

CREATE PROCEDURE procedurename [parameters] [options]
AS
SQL statement;

Examples:

-- SQL Server procedure
CREATE PROCEDURE GetCustomerOrders
    @cust_id CHAR(10)
AS
BEGIN
    SELECT * FROM Orders 
    WHERE cust_id = @cust_id;
END;
 
-- Procedure with output parameter
CREATE PROCEDURE GetOrderTotal
    @order_num INTEGER,
    @total MONEY OUTPUT
AS
BEGIN
    SELECT @total = SUM(quantity * item_price)
    FROM OrderItems
    WHERE order_num = @order_num;
END;
 
-- Oracle procedure syntax
CREATE PROCEDURE GetCustomerOrders(
    p_cust_id IN VARCHAR2
)
IS
BEGIN
    SELECT * FROM Orders 
    WHERE cust_id = p_cust_id;
END;

Related: See Lesson 19 - Working with Stored Procedures


CREATE TABLE

Purpose: Create a new database table

CREATE TABLE tablename
(
    column    datatype    [NULL|NOT NULL]    [CONSTRAINTS],
    column    datatype    [NULL|NOT NULL]    [CONSTRAINTS],
    ...
);

Examples:

-- Basic table creation
CREATE TABLE Customers
(
    cust_id      CHAR(10)     NOT NULL PRIMARY KEY,
    cust_name    VARCHAR(50)  NOT NULL,
    cust_email   VARCHAR(100) NULL,
    cust_phone   VARCHAR(15)  NULL
);
 
-- Table with constraints
CREATE TABLE Orders
(
    order_num    INTEGER      NOT NULL PRIMARY KEY,
    order_date   DATE         NOT NULL DEFAULT GETDATE(),
    cust_id      CHAR(10)     NOT NULL,
    order_total  MONEY        CHECK (order_total >= 0),
    
    CONSTRAINT fk_orders_customers 
        FOREIGN KEY (cust_id) REFERENCES Customers(cust_id)
);
 
-- Table with auto-increment
CREATE TABLE Products
(
    prod_id      INTEGER      IDENTITY(1,1) PRIMARY KEY,
    prod_name    VARCHAR(100) NOT NULL,
    prod_price   DECIMAL(8,2) NOT NULL CHECK (prod_price > 0),
    created_date DATETIME     DEFAULT GETDATE()
);

Related: See Lesson 17 - Creating and Manipulating Tables


CREATE VIEW

Purpose: Create a virtual table based on a SELECT statement

CREATE VIEW viewname AS
SELECT columns, ...
FROM tables, ...
[WHERE ...]
[GROUP BY ...]
[HAVING ...];

Examples:

-- Simple view
CREATE VIEW CustomerOrders AS
SELECT c.cust_name, o.order_num, o.order_date
FROM Customers c
INNER JOIN Orders o ON c.cust_id = o.cust_id;
 
-- View with calculations
CREATE VIEW OrderTotals AS
SELECT order_num,
       SUM(quantity * item_price) AS order_total,
       COUNT(*) AS item_count
FROM OrderItems
GROUP BY order_num;
 
-- Filtered view
CREATE VIEW ActiveCustomers AS
SELECT cust_id, cust_name, cust_email
FROM Customers
WHERE cust_status = 'ACTIVE'
  AND cust_email IS NOT NULL;
 
-- Complex view with joins and aggregation
CREATE VIEW ProductSales AS
SELECT p.prod_name,
       p.prod_price,
       SUM(oi.quantity) AS total_sold,
       SUM(oi.quantity * oi.item_price) AS total_revenue
FROM Products p
LEFT JOIN OrderItems oi ON p.prod_id = oi.prod_id
GROUP BY p.prod_id, p.prod_name, p.prod_price;

Related: See Lesson 18 - Using Views


DELETE

Purpose: Remove one or more rows from a table

DELETE FROM tablename
[WHERE ...];

Examples:

-- Delete specific rows
DELETE FROM Customers
WHERE cust_state = 'CA';
 
-- Delete with subquery
DELETE FROM Orders
WHERE cust_id IN (
    SELECT cust_id FROM Customers 
    WHERE cust_status = 'INACTIVE'
);
 
-- Delete all rows (use with caution!)
DELETE FROM TempTable;
 
-- Delete with JOIN (SQL Server)
DELETE o
FROM Orders o
INNER JOIN Customers c ON o.cust_id = c.cust_id
WHERE c.cust_status = 'CLOSED';
 
-- Delete with LIMIT (MySQL)
DELETE FROM Orders
WHERE order_date < '2020-01-01'
LIMIT 100;

Related: See Lesson 16 - Updating and Deleting Data


DROP

Purpose: Permanently remove database objects

DROP INDEX|PROCEDURE|TABLE|VIEW indexname|procedurename|tablename|viewname;

Examples:

-- Drop table
DROP TABLE TempCustomers;
 
-- Drop view
DROP VIEW CustomerOrders;
 
-- Drop index
DROP INDEX idx_cust_name;
 
-- Drop procedure
DROP PROCEDURE GetCustomerOrders;
 
-- Drop with IF EXISTS (safer)
DROP TABLE IF EXISTS TempTable;
DROP VIEW IF EXISTS OldView;
 
-- Drop multiple objects
DROP TABLE Table1, Table2, Table3;

Related: See Lessons 17 and 18


INSERT

Purpose: Add a single row to a table

INSERT INTO tablename [(columns, ...)]
VALUES (values, ...);

Examples:

-- Insert with all columns
INSERT INTO Customers
VALUES ('1000000006', 'Toy Land', '123 Any Street', 'New York', 'NY', '11111', 'USA', NULL, NULL);
 
-- Insert with specified columns
INSERT INTO Customers (cust_id, cust_name, cust_city, cust_state)
VALUES ('1000000007', 'The Toy Store', 'Chicago', 'IL');
 
-- Insert multiple rows
INSERT INTO Products (prod_name, prod_price)
VALUES 
    ('Product A', 19.99),
    ('Product B', 29.99),
    ('Product C', 39.99);
 
-- Insert with DEFAULT values
INSERT INTO Orders (cust_id, order_date)
VALUES ('1000000001', DEFAULT);
 
-- Insert with calculated values
INSERT INTO OrderSummary (order_num, order_total)
VALUES (20001, (SELECT SUM(quantity * item_price) FROM OrderItems WHERE order_num = 20001));

Related: See Lesson 15 - Inserting Data


INSERT SELECT

Purpose: Insert the results of a SELECT statement into a table

INSERT INTO tablename [(columns, ...)]
SELECT columns, ...
FROM tablename, ...
[WHERE ...];

Examples:

-- Copy all data
INSERT INTO CustomersBackup
SELECT * FROM Customers;
 
-- Copy specific columns
INSERT INTO CustomerContacts (cust_id, cust_name, cust_email)
SELECT cust_id, cust_name, cust_email
FROM Customers
WHERE cust_email IS NOT NULL;
 
-- Insert with calculations
INSERT INTO MonthlySales (month_year, total_sales)
SELECT 
    FORMAT(order_date, 'yyyy-MM') AS month_year,
    SUM(order_total) AS total_sales
FROM Orders
WHERE order_date >= '2023-01-01'
GROUP BY FORMAT(order_date, 'yyyy-MM');
 
-- Insert from multiple tables
INSERT INTO CustomerOrderSummary (cust_id, cust_name, order_count, total_spent)
SELECT 
    c.cust_id,
    c.cust_name,
    COUNT(o.order_num) AS order_count,
    SUM(o.order_total) AS total_spent
FROM Customers c
LEFT JOIN Orders o ON c.cust_id = o.cust_id
GROUP BY c.cust_id, c.cust_name;

Related: See Lesson 15 - Inserting Data


ROLLBACK

Purpose: Undo transaction changes

ROLLBACK [TO savepointname];
-- or
ROLLBACK TRANSACTION;

Examples:

-- Basic rollback
BEGIN TRANSACTION;
    DELETE FROM Orders WHERE order_date < '2020-01-01';
    -- Oops, wrong date!
ROLLBACK;
 
-- Rollback to savepoint
BEGIN TRANSACTION;
    INSERT INTO Customers VALUES(...);
    SAVE TRANSACTION customer_added;
    
    INSERT INTO Orders VALUES(...);
    -- Error occurred
    ROLLBACK TRANSACTION customer_added;  -- Keep customer, undo order
COMMIT;
 
-- Conditional rollback
BEGIN TRANSACTION;
    UPDATE Inventory SET quantity = quantity - 10 WHERE prod_id = 'P001';
    
    IF (SELECT quantity FROM Inventory WHERE prod_id = 'P001') < 0
    BEGIN
        ROLLBACK;
        RAISERROR('Insufficient inventory', 16, 1);
    END
    ELSE
    BEGIN
        COMMIT;
    END

Related: See Lesson 20 - Managing Transaction Processing


SELECT

Purpose: Retrieve data from one or more tables or views

SELECT columnname, ...
FROM tablename, ...
[WHERE ...]
[UNION ...]
[GROUP BY ...]
[HAVING ...]
[ORDER BY ...];

Examples:

-- Basic SELECT
SELECT cust_name, cust_email
FROM Customers;
 
-- SELECT with WHERE clause
SELECT prod_name, prod_price
FROM Products
WHERE prod_price BETWEEN 10 AND 50;
 
-- SELECT with JOIN
SELECT c.cust_name, o.order_num, o.order_date
FROM Customers c
INNER JOIN Orders o ON c.cust_id = o.cust_id;
 
-- SELECT with aggregation
SELECT cust_state, COUNT(*) AS customer_count
FROM Customers
GROUP BY cust_state
HAVING COUNT(*) > 5
ORDER BY customer_count DESC;
 
-- SELECT with subquery
SELECT cust_name
FROM Customers
WHERE cust_id IN (
    SELECT DISTINCT cust_id 
    FROM Orders 
    WHERE order_date >= '2023-01-01'
);
 
-- SELECT with UNION
SELECT cust_name, cust_city FROM Customers WHERE cust_state = 'CA'
UNION
SELECT cust_name, cust_city FROM Customers WHERE cust_state = 'NY';
 
-- Complex SELECT with window functions
SELECT 
    cust_name,
    order_total,
    ROW_NUMBER() OVER (PARTITION BY cust_state ORDER BY order_total DESC) AS rank_in_state
FROM Customers c
INNER JOIN Orders o ON c.cust_id = o.cust_id;

Related: See Lessons 2-14 covering all aspects of SELECT


UPDATE

Purpose: Modify one or more rows in a table

UPDATE tablename
SET columnname = value, ...
[WHERE ...];

Examples:

-- Update specific rows
UPDATE Customers
SET cust_email = 'newemail@example.com'
WHERE cust_id = '1000000001';
 
-- Update multiple columns
UPDATE Products
SET prod_price = prod_price * 1.1,
    modified_date = GETDATE()
WHERE prod_category = 'Electronics';
 
-- Update with calculation
UPDATE OrderItems
SET item_price = (
    SELECT prod_price 
    FROM Products 
    WHERE Products.prod_id = OrderItems.prod_id
);
 
-- Update with JOIN (SQL Server)
UPDATE oi
SET item_price = p.prod_price
FROM OrderItems oi
INNER JOIN Products p ON oi.prod_id = p.prod_id;
 
-- Conditional update
UPDATE Customers
SET cust_status = CASE 
    WHEN last_order_date < DATEADD(year, -1, GETDATE()) THEN 'INACTIVE'
    WHEN last_order_date < DATEADD(month, -6, GETDATE()) THEN 'DORMANT'
    ELSE 'ACTIVE'
END;
 
-- Update all rows (use with caution!)
UPDATE Products
SET modified_date = GETDATE();

Related: See Lesson 16 - Updating and Deleting Data


Advanced Syntax Patterns

Common Table Expressions (CTE):

WITH CustomerTotals AS (
    SELECT cust_id, SUM(order_total) AS total_spent
    FROM Orders
    GROUP BY cust_id
)
SELECT c.cust_name, ct.total_spent
FROM Customers c
INNER JOIN CustomerTotals ct ON c.cust_id = ct.cust_id;

Window Functions:

SELECT 
    cust_name,
    order_total,
    ROW_NUMBER() OVER (ORDER BY order_total DESC) AS rank,
    SUM(order_total) OVER (PARTITION BY cust_state) AS state_total
FROM Customers c
INNER JOIN Orders o ON c.cust_id = o.cust_id;

MERGE Statement (SQL Server):

MERGE CustomerSummary AS target
USING (
    SELECT cust_id, COUNT(*) AS order_count, SUM(order_total) AS total_spent
    FROM Orders
    GROUP BY cust_id
) AS source ON target.cust_id = source.cust_id
WHEN MATCHED THEN
    UPDATE SET order_count = source.order_count, total_spent = source.total_spent
WHEN NOT MATCHED THEN
    INSERT (cust_id, order_count, total_spent)
    VALUES (source.cust_id, source.order_count, source.total_spent);

Quick Reference Summary

StatementPurposeKey Clauses
SELECTRetrieve dataFROM, WHERE, GROUP BY, HAVING, ORDER BY
INSERTAdd new rowsINTO, VALUES / SELECT
UPDATEModify existing rowsSET, WHERE
DELETERemove rowsFROM, WHERE
CREATE TABLECreate new tableColumn definitions, constraints
ALTER TABLEModify table structureADD, DROP, ALTER
CREATE VIEWCreate virtual tableAS SELECT
CREATE INDEXImprove performanceON table(columns)
CREATE PROCEDURECreate stored procedureParameters, AS, SQL statements
COMMITSave transactionOptional TRANSACTION
ROLLBACKUndo transactionOptional TO savepoint
DROPRemove objectsObject type and name

Database-Specific Variations

Remember that while this syntax works with most DBMSs, specific implementations may vary:

  • SQL Server: Uses IDENTITY for auto-increment, GETDATE() for current date
  • MySQL: Uses AUTO_INCREMENT, NOW() for current timestamp
  • PostgreSQL: Uses SERIAL for auto-increment, CURRENT_TIMESTAMP
  • Oracle: Uses SEQUENCE objects, SYSDATE for current date
  • SQLite: Limited ALTER TABLE support, uses AUTOINCREMENT

Datatypes

Understanding Datatypes

Datatypes are rules that define what data may be stored in a column and how that data is actually stored internally by the database management system.

Why Datatypes Matter:

BenefitDescriptionExample
Data ValidationRestrict invalid data entryNumeric column rejects text input
Storage EfficiencyOptimize space usageINTEGER uses 4 bytes vs VARCHAR(50)
Correct SortingEnable proper orderingNumbers: 1, 2, 10 vs Strings: "1", "10", "2"
PerformanceEnable indexing and optimizationFixed-length fields process faster
Data IntegrityPrevent data corruptionDate fields reject invalid dates

Critical Considerations:

-- ❌ Wrong datatype choice
CREATE TABLE Orders (
    order_id VARCHAR(10),     -- Should be INTEGER
    zip_code INTEGER,         -- Should be VARCHAR (01234 becomes 1234)
    order_date VARCHAR(20),   -- Should be DATE
    is_shipped INTEGER        -- Should be BOOLEAN/BIT
);
 
-- ✅ Correct datatype choice
CREATE TABLE Orders (
    order_id INTEGER PRIMARY KEY,
    zip_code VARCHAR(10),     -- Preserves leading zeros
    order_date DATE,          -- Proper date validation and functions
    is_shipped BIT            -- Boolean true/false values
);

String Datatypes

String datatypes store text data including names, addresses, descriptions, and codes.

Fixed-Length vs Variable-Length:

| Type | Storage | Performance | Use Case |-----|-----|----- | Fixed-Length | Always uses full allocated space | Faster processing and indexing | Codes, IDs, consistent-length data | Variable-Length | Uses only space needed | Slower but space-efficient | Names, descriptions, variable content

Common String Datatypes:

-- Fixed-length strings
CHAR(n)         -- Exactly n characters, padded with spaces
NCHAR(n)        -- Unicode fixed-length
 
-- Variable-length strings  
VARCHAR(n)      -- Up to n characters
NVARCHAR(n)     -- Unicode variable-length
TEXT            -- Large text blocks (also LONG, MEMO)
CLOB            -- Character Large Object

Practical Examples:

CREATE TABLE Customers (
    -- Fixed-length: consistent size, faster processing
    cust_id        CHAR(10),           -- Always 10 characters: "CUST000001"
    state_code     CHAR(2),            -- Always 2 characters: "CA", "NY"
    country_code   CHAR(3),            -- Always 3 characters: "USA", "CAN"
    
    -- Variable-length: space-efficient for varying content
    cust_name      VARCHAR(100),       -- Up to 100 characters
    cust_address   VARCHAR(255),       -- Up to 255 characters
    cust_notes     TEXT,               -- Large text blocks
    
    -- Unicode support for international characters
    cust_name_local NVARCHAR(100)      -- Unicode names: "José", "北京"
);

String Datatype Guidelines:

-- ✅ Good practices
CREATE TABLE Products (
    prod_code      CHAR(8),            -- Fixed format: "PROD0001"
    prod_name      VARCHAR(100),       -- Variable names
    prod_desc      TEXT,               -- Long descriptions
    sku            VARCHAR(20),        -- Variable SKU lengths
    barcode        VARCHAR(13)         -- UPC codes (preserve leading zeros)
);
 
-- ❌ Common mistakes
CREATE TABLE BadExample (
    phone_number   INTEGER,            -- Loses leading zeros, formatting
    zip_code       INTEGER,            -- "01234" becomes 1234
    product_name   CHAR(200),          -- Wastes space for short names
    description    VARCHAR(50)         -- Too short for descriptions
);

Database-Specific Variations:

-- SQL Server
NVARCHAR(MAX)   -- Up to 2GB Unicode text
VARCHAR(MAX)    -- Up to 2GB non-Unicode text
 
-- MySQL
TINYTEXT        -- Up to 255 characters
MEDIUMTEXT      -- Up to 16MB
LONGTEXT        -- Up to 4GB
 
-- PostgreSQL
TEXT            -- Unlimited length (practical limit ~1GB)
VARCHAR         -- Without length limit
 
-- Oracle
VARCHAR2(n)     -- Variable-length (Oracle's preferred VARCHAR)
CLOB            -- Up to 4GB character data
NCLOB           -- Unicode CLOB

Numeric Datatypes

Numeric datatypes store numbers for calculations, measurements, and counting.

Integer Types:

-- Common integer datatypes
TINYINT         -- 1 byte: 0 to 255 (or -128 to 127)
SMALLINT        -- 2 bytes: -32,768 to 32,767  
INT/INTEGER     -- 4 bytes: -2,147,483,648 to 2,147,483,647
BIGINT          -- 8 bytes: very large integers
BIT             -- 1 bit: 0 or 1 (boolean flags)

Decimal and Floating-Point Types:

-- Exact decimal numbers
DECIMAL(p,s)    -- p=precision (total digits), s=scale (decimal places)
NUMERIC(p,s)    -- Same as DECIMAL
 
-- Floating-point numbers
FLOAT           -- Approximate floating-point
REAL            -- 4-byte floating-point
DOUBLE          -- 8-byte floating-point (higher precision)

Practical Examples:

CREATE TABLE Products (
    prod_id        INTEGER,            -- Whole numbers for IDs
    prod_price     DECIMAL(10,2),      -- Money: 99999999.99
    prod_weight    DECIMAL(8,3),       -- Weight: 12345.678 kg
    prod_rating    DECIMAL(3,2),       -- Rating: 4.75 out of 5
    stock_count    INTEGER,            -- Whole number count
    is_featured    BIT,                -- Boolean: 1=yes, 0=no
    discount_pct   DECIMAL(5,2)        -- Percentage: 15.50%
);
 
CREATE TABLE OrderItems (
    quantity       SMALLINT,           -- Small positive numbers
    item_price     DECIMAL(8,2),       -- Currency values
    line_total     DECIMAL(10,2),      -- Calculated totals
    tax_rate       DECIMAL(6,4)        -- Tax: 0.0825 (8.25%)
);

Choosing the Right Numeric Type:

-- ✅ Appropriate choices
CREATE TABLE Measurements (
    temperature    DECIMAL(5,2),       -- -999.99 to 999.99°C
    pressure       DECIMAL(8,3),       -- Precise measurements
    count          INTEGER,            -- Whole number counts
    percentage     DECIMAL(5,2),       -- 0.00 to 100.00%
    is_active      BIT                 -- Boolean flag
);
 
-- ❌ Poor choices
CREATE TABLE BadMeasurements (
    temperature    INTEGER,            -- Loses decimal precision
    money_amount   FLOAT,              -- Rounding errors in currency
    small_count    BIGINT,             -- Wastes space
    percentage     INTEGER             -- Can't represent 15.5%
);

Currency Handling:

-- Dedicated currency types
MONEY           -- SQL Server: 8-byte currency
SMALLMONEY      -- SQL Server: 4-byte currency
 
-- Best practice: Use DECIMAL for currency
CREATE TABLE Orders (
    order_total    DECIMAL(12,2),      -- Up to $9,999,999,999.99
    tax_amount     DECIMAL(10,2),      -- Tax calculations
    shipping_cost  DECIMAL(8,2)        -- Shipping fees
);

Date and Time Datatypes

Date and time datatypes store temporal data with various levels of precision.

Common Date/Time Types:

DATE            -- Date only: 2024-12-25
TIME            -- Time only: 14:30:00
DATETIME        -- Date and time: 2024-12-25 14:30:00
TIMESTAMP       -- Date/time with timezone info
SMALLDATETIME   -- Date/time with minute precision

Database-Specific Implementations:

-- SQL Server
DATETIME        -- 1753-9999, 3.33ms precision
DATETIME2       -- 0001-9999, 100ns precision  
SMALLDATETIME   -- 1900-2079, 1 minute precision
DATE            -- Date only
TIME            -- Time only
DATETIMEOFFSET  -- With timezone offset
 
-- MySQL
DATE            -- 1000-01-01 to 9999-12-31
DATETIME        -- 1000-01-01 00:00:00 to 9999-12-31 23:59:59
TIMESTAMP       -- 1970-01-01 00:00:01 to 2038-01-19 03:14:07
TIME            -- -838:59:59 to 838:59:59
YEAR            -- 1901 to 2155
 
-- PostgreSQL
DATE            -- Date only
TIME            -- Time only
TIMESTAMP       -- Date and time
TIMESTAMPTZ     -- With timezone
INTERVAL        -- Time intervals
 
-- Oracle
DATE            -- Date and time (no fractional seconds)
TIMESTAMP       -- With fractional seconds
TIMESTAMP WITH TIME ZONE
TIMESTAMP WITH LOCAL TIME ZONE

Practical Examples:

CREATE TABLE Events (
    event_id       INTEGER PRIMARY KEY,
    event_name     VARCHAR(100),
    event_date     DATE,               -- 2024-07-04
    start_time     TIME,               -- 14:30:00
    end_time       TIME,               -- 17:00:00
    created_at     DATETIME,           -- 2024-01-15 09:30:00
    updated_at     TIMESTAMP           -- Auto-updated timestamp
);
 
CREATE TABLE Orders (
    order_id       INTEGER PRIMARY KEY,
    order_date     DATE,               -- When order was placed
    ship_date      DATE,               -- When order shipped
    created_at     DATETIME DEFAULT GETDATE(),
    modified_at    DATETIME DEFAULT GETDATE()
);

Date Format Considerations:

-- ✅ ISO format (recommended)
INSERT INTO Orders (order_date) VALUES ('2024-12-25');
INSERT INTO Events (start_time) VALUES ('14:30:00');
 
-- ✅ ODBC format (portable)
INSERT INTO Orders (order_date) VALUES ({d '2024-12-25'});
INSERT INTO Events (start_time) VALUES ({t '14:30:00'});
INSERT INTO Logs (timestamp) VALUES ({ts '2024-12-25 14:30:00'});
 
-- ❌ Ambiguous formats (avoid)
INSERT INTO Orders (order_date) VALUES ('12/25/2024');  -- US format
INSERT INTO Orders (order_date) VALUES ('25/12/2024');  -- European format

Working with Timezones:

-- PostgreSQL timezone handling
CREATE TABLE GlobalEvents (
    event_id       SERIAL PRIMARY KEY,
    event_name     VARCHAR(100),
    event_time     TIMESTAMP WITH TIME ZONE,
    local_time     TIMESTAMP WITHOUT TIME ZONE
);
 
-- SQL Server timezone handling
CREATE TABLE GlobalEvents (
    event_id       INT IDENTITY PRIMARY KEY,
    event_name     VARCHAR(100),
    event_time     DATETIMEOFFSET,     -- 2024-12-25 14:30:00 -08:00
    utc_time       DATETIME2           -- UTC equivalent
);

Binary Datatypes

Binary datatypes store non-text data including images, documents, and multimedia files.

Common Binary Types:

BINARY(n)       -- Fixed-length binary data
VARBINARY(n)    -- Variable-length binary data
BLOB            -- Binary Large Object
IMAGE           -- Legacy image storage (deprecated)

Database-Specific Binary Types:

-- SQL Server
BINARY(n)       -- Fixed-length: 1 to 8,000 bytes
VARBINARY(n)    -- Variable-length: 1 to 8,000 bytes
VARBINARY(MAX)  -- Up to 2GB
IMAGE           -- Deprecated, use VARBINARY(MAX)
 
-- MySQL
BINARY(n)       -- Fixed-length binary string
VARBINARY(n)    -- Variable-length binary string
TINYBLOB        -- Up to 255 bytes
BLOB            -- Up to 65KB
MEDIUMBLOB      -- Up to 16MB
LONGBLOB        -- Up to 4GB
 
-- PostgreSQL
BYTEA           -- Variable-length binary data
 
-- Oracle
RAW(n)          -- Fixed-length: up to 2,000 bytes
LONG RAW        -- Variable-length: up to 2GB
BLOB            -- Binary Large Object: up to 4GB

Practical Examples:

CREATE TABLE Documents (
    doc_id         INTEGER PRIMARY KEY,
    doc_name       VARCHAR(255),
    doc_type       VARCHAR(50),        -- 'PDF', 'DOCX', 'JPG'
    doc_size       INTEGER,            -- Size in bytes
    doc_content    VARBINARY(MAX),     -- Actual file content
    doc_hash       BINARY(32),         -- SHA-256 hash (fixed 32 bytes)
    created_at     DATETIME DEFAULT GETDATE()
);
 
CREATE TABLE UserProfiles (
    user_id        INTEGER PRIMARY KEY,
    username       VARCHAR(50),
    profile_image  VARBINARY(MAX),     -- Profile photo
    thumbnail      VARBINARY(MAX),     -- Smaller version
    image_type     VARCHAR(10)         -- 'JPG', 'PNG', 'GIF'
);

Binary Data Considerations:

-- ✅ Good practices
CREATE TABLE FileStorage (
    file_id        INTEGER PRIMARY KEY,
    filename       VARCHAR(255),
    file_size      BIGINT,             -- Size in bytes
    mime_type      VARCHAR(100),       -- Content type
    file_hash      BINARY(32),         -- Integrity check
    file_data      VARBINARY(MAX),     -- Actual content
    upload_date    DATETIME DEFAULT GETDATE()
);
 
-- ❌ Considerations
-- Large binary data can impact performance
-- Consider storing files externally and keeping paths in database
CREATE TABLE FileReferences (
    file_id        INTEGER PRIMARY KEY,
    filename       VARCHAR(255),
    file_path      VARCHAR(500),       -- Path to external file
    file_size      BIGINT,
    mime_type      VARCHAR(100)
);

Specialized Datatypes

Modern databases offer specialized datatypes for specific use cases.

JSON and XML:

-- PostgreSQL
CREATE TABLE ApiLogs (
    log_id         SERIAL PRIMARY KEY,
    request_data   JSON,               -- JSON data
    response_data  JSONB,              -- Binary JSON (faster)
    created_at     TIMESTAMP DEFAULT NOW()
);
 
-- SQL Server
CREATE TABLE ConfigData (
    config_id      INT PRIMARY KEY,
    config_name    VARCHAR(100),
    config_xml     XML,                -- XML data
    config_json    NVARCHAR(MAX)       -- JSON as text
);
 
-- MySQL
CREATE TABLE UserPreferences (
    user_id        INT PRIMARY KEY,
    preferences    JSON                -- Native JSON support
);

Geographic Data:

-- PostgreSQL with PostGIS
CREATE TABLE Locations (
    location_id    SERIAL PRIMARY KEY,
    place_name     VARCHAR(100),
    coordinates    GEOMETRY(POINT, 4326),  -- Latitude/longitude
    boundary       GEOMETRY(POLYGON, 4326) -- Geographic boundary
);
 
-- SQL Server spatial types
CREATE TABLE GeoData (
    id             INT PRIMARY KEY,
    location       GEOGRAPHY,          -- Earth-based coordinates
    shape          GEOMETRY            -- Planar coordinates
);

UUID/GUID:

-- PostgreSQL
CREATE TABLE Sessions (
    session_id     UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    user_id        INTEGER,
    created_at     TIMESTAMP DEFAULT NOW()
);
 
-- SQL Server
CREATE TABLE Transactions (
    transaction_id UNIQUEIDENTIFIER PRIMARY KEY DEFAULT NEWID(),
    amount         DECIMAL(10,2),
    created_at     DATETIME DEFAULT GETDATE()
);

Datatype Selection Guidelines

Best Practices for Choosing Datatypes:

1. Storage Efficiency:

-- ✅ Right-sized datatypes
CREATE TABLE Inventory (
    item_id        INTEGER,            -- Not BIGINT for small datasets
    quantity       SMALLINT,           -- Not INTEGER for small quantities
    price          DECIMAL(8,2),       -- Not DECIMAL(20,10) for currency
    is_active      BIT                 -- Not INTEGER for boolean
);
 
-- ❌ Oversized datatypes
CREATE TABLE WastedSpace (
    item_id        BIGINT,             -- Wastes 4 bytes per row
    quantity       INTEGER,            -- Wastes 2 bytes per row
    price          DECIMAL(20,10),     -- Unnecessary precision
    is_active      INTEGER             -- Wastes 3 bytes per row
);

2. Data Validation:

-- ✅ Appropriate constraints with datatypes
CREATE TABLE Orders (
    order_id       INTEGER PRIMARY KEY,
    order_date     DATE CHECK (order_date <= GETDATE()),
    ship_date      DATE CHECK (ship_date >= order_date),
    order_total    DECIMAL(10,2) CHECK (order_total >= 0),
    status         VARCHAR(20) CHECK (status IN ('PENDING', 'SHIPPED', 'DELIVERED'))
);

3. Performance Considerations:

-- ✅ Indexable datatypes
CREATE TABLE Customers (
    cust_id        INTEGER PRIMARY KEY,    -- Fast integer index
    cust_code      CHAR(10),              -- Fixed-length, indexable
    cust_name      VARCHAR(100),          -- Variable but reasonable length
    created_date   DATE                   -- Date indexes work well
);
 
-- Create appropriate indexes
CREATE INDEX idx_cust_code ON Customers(cust_code);
CREATE INDEX idx_cust_name ON Customers(cust_name);
CREATE INDEX idx_created_date ON Customers(created_date);

4. Future-Proofing:

-- ✅ Allow for growth
CREATE TABLE Products (
    prod_id        INTEGER,            -- Room for millions of products
    prod_code      VARCHAR(20),        -- Flexible code format
    prod_name      VARCHAR(200),       -- Accommodate long names
    prod_price     DECIMAL(12,2),      -- Handle expensive items
    description    TEXT                -- Unlimited description length
);

Common Datatype Mistakes

Avoid These Common Pitfalls:

-- ❌ Wrong datatype for the data
CREATE TABLE BadChoices (
    phone_number   INTEGER,            -- Loses formatting, leading zeros
    zip_code       INTEGER,            -- "01234" becomes 1234
    ssn            INTEGER,            -- Too large for INTEGER, formatting lost
    price          FLOAT,              -- Rounding errors in currency
    is_deleted     VARCHAR(5),         -- "true"/"false" instead of BIT
    order_date     VARCHAR(20)         -- String instead of DATE
);
 
-- ✅ Correct datatype choices
CREATE TABLE GoodChoices (
    phone_number   VARCHAR(15),        -- Preserves formatting: "(555) 123-4567"
    zip_code       VARCHAR(10),        -- Preserves leading zeros: "01234"
    ssn            CHAR(11),           -- Fixed format: "123-45-6789"
    price          DECIMAL(10,2),      -- Exact currency values
    is_deleted     BIT,                -- True boolean values
    order_date     DATE                -- Proper date handling
);

Cross-Database Compatibility

Portable Datatype Choices:

-- Most compatible across databases
CREATE TABLE PortableTable (
    id             INTEGER,            -- Widely supported
    name           VARCHAR(100),       -- Standard variable text
    description    TEXT,               -- Large text (varies in size limits)
    price          DECIMAL(10,2),      -- Exact numeric
    created_date   DATE,               -- Date only
    is_active      SMALLINT            -- 0/1 for boolean (most compatible)
);

Database Migration Mapping:

| Generic Type | SQL Server | MySQL | PostgreSQL | Oracle |-----|-----|----- | Small Integer | SMALLINT | SMALLINT | SMALLINT | NUMBER(5) | Integer | INT | INT | INTEGER | NUMBER(10) | Large Integer | BIGINT | BIGINT | BIGINT | NUMBER(19) | Decimal | DECIMAL(p,s) | DECIMAL(p,s) | DECIMAL(p,s) | NUMBER(p,s) | Variable Text | VARCHAR(n) | VARCHAR(n) | VARCHAR(n) | VARCHAR2(n) | Large Text | TEXT | TEXT | TEXT | CLOB | Date | DATE | DATE | DATE | DATE | Date/Time | DATETIME | DATETIME | TIMESTAMP | DATE | Boolean | BIT | BOOLEAN | BOOLEAN | NUMBER(1)