Skip to content

Structured Query Language

Structured Query Language (SQL)

Criteria for an ideal DB Language, which SQL is!

  • Create database and table structures, perform basic data management chores (CRUD)
  • Perform complex queries designed to transform raw data into useful information.
  • Require minimal user effort
  • Easy to learn
    • Fewer than 100 words in command set
    • Declarative, NOT Procedural (i.e. you just declare what you want in SQL and the RDBMS figures out how to get/do your command).
  • Portable
    • ANSI/ISO SQL standard exist, so only minor dialect differences between competing DB Software manufacturers.

Categories of SQL

  • Data Manipulation Language (DML): Manipulates data
    • INSERT, SELECT, UPDATE, DELETE Data Manipulation Language
  • Data Definition Language (DDL): Creates or alters schema and structure
    • CREATE, DROP, ALTER, TRUNCATE, RENAME
  • Transaction Control Language (TCL): Manages multiple DML commands from Bus. Rules in transactions.
    • A transaction is a logical unit of work of set of DML commands.
    • COMMIT, ROLLBACK, SAVEPOINT Data Definition & Transaction Control Language
  • Data Control Language (DCL): Access Control, security and permissions.
    • GRANT, REVOKE Data Control Language

Queries and Operations

Important

Most SQL data manipulation commands operate over an entire table/relation, which is why SQL commands are said to be set-oriented commands. This is important because it allows users to treat SQL commands like building blocks. Once you know the building blocks, you can put them together to declare more complex sets you want.

Intro to Data Types

  • Data Type is basically the KIND of data that an attribute represents.
  • Different RDBMs have similar and different data types, but all DB types fall under 3 categories:

3 Fundamental Types of Data

  • Character data: Any printable characters such as alphabetic values, digits, punctuation, and special characters.
  • Numeric data: digits, such that the data has a specific numeric value.
  • Date data: DATES and Times ;-)

Retrieval (SELECT Query)

Basic Terms

  • SELECT: specifies the attributes to be returned by the query
  • FROM: specifies the table(s) from which the data will be retrieved
  • WHERE: filters the rows of data based on provided criteria
  • GROUP BY: groups the rows of data into collections based on sharing the same values in one or more attributes
  • HAVING: filters the groups formed in the GROUP BY clause based on provided criteria
  • ORDER BY: sorts the final query result rows in ascending or descending order based on the values of one or more attributes.

Basic Structure

1
SELECT [wildcard | column-list] FROM [table-list];
  • Column-list: Relational projection of attributes to return.
  • Table-list: Relations to be joined in query.
  • Wildcard: “*” characters used to designate ALL columns in the relations.

Examples

1
2
# Get all rows and columns from product table.
SELECT * FROM Product;
1
2
# Get all rows but only columns P_CODE, P_DESCRIPT, P_PRICE, and P_QOH  from product table.
SELECT P_CODE, P_DESCRIPT, P_PRICE, P_QOH FROM Product;
1
2
3
# Get all rows but only columns P_CODE, P_DESCRIPT, P_PRICE, and P_QOH  from product table.
# Alias/Rename the resulting columns in the relation to: Id, Description, Price, and Qty.
SELECT P_CODE AS Id, P_DESCRIPT AS Description, P_PRICE AS Price, P_QOH AS Qty FROM Product;
1
2
# Get all rows but only columns P_DESCRIPT, P_QOH, P_PRICE, and derived/calculated column INV_VAL  from product table.
SELECT P_DESCRIPT, P_QOH, P_PRICE, P_QOH * P_PRICE AS INV_VAL FROM PRODUCT;
1
2
3
# Get a list of products and the warranty cutoff date for products
# Swap out Date function as needed based on RDBMS Access: DATE(), SQL Server: GETDATE(), MySQL: CURDATE(), Oracle: SYSDATE 
SELECT P_CODE, P_INDATE, SYSDATE – 90 AS CUTOFF FROM PRODUCT;
1
2
# Select all the UNIQUE vendors that have products (i.e. DISTINCT removes repeats).
SELECT DISTINCT V_CODE FROM PRODUCT;

Other Notes

  • When using Arithmetic Operators in derived/calculated columns PEMDAS applies.
  • Dates are typically stored as a number in RDBMs as the number of days since a specific date in history.
    • Thus Date arithmetic is + or - the number of days.

SQL Arithmetic Operators

Joining

Basic Terms

  • FROM: specifies the table(s) from which the data will be retrieved. Tables joined by PK/FK pairs or a auto selected common attribute..
  • NATURAL JOIN: returns all rows with matching values in the matching columns and eliminates duplicate columns.
    • Joins on attributes with identical names and data-types
    • Joins only common values in chosen attribute(s)
    • Eliminates duplicate common attribute
    • If no common attributes, returns relational product of 2 tables (i.e. each row from table1 duplicated with each row of table2).
    • Usage is discouraged becuase table changes can affect common attribute joins
  • JOIN USING: returns only the rows with matching values in the column indicated in the USING clause.
    • The column must exist in both tables.
    • Eliminates duplicate common attribute
    • Only in MySQL and Oracle
    • Oracle doesn’t allow usage of table qualifiers can’t be used with the common attribute.
  • JOIN ON: joins tables with no common attribute based on equality condition.
    • Attributes equality condition must have the same data-type.
    • Preferred Join method because its the most widely supported.
    • Does NOT eliminate duplicate common attribute(s)
    • table qualifier must be used to referr to duplicate common attributes.
  • OUTER JOINS:
    • LEFT: returns not only the rows matching the join condition, but also the rows in the left table with unmatched values in the right table.
    • RIGHT: returns not only the rows matching the join condition, but also the rows in the right table with unmatched values in the left table.
    • FULL: returns not only the rows matching the join condition, but also the rows with unmatched values in the table on eithr side.
      • Only supported by Oracle and SqlServer
  • CROSS JOIN: Cartesian/Relational Product, joins all rows from table1 to all rows in table2
    • Not supported in Access
  • TABLE ALIAS: A way of renaming a table in a query for 3 common purposes:
    • Shortening names
    • Improving query human readability.
    • Recursive Joins (table joined to itself)

Basic Structure

  • NATURAL JOIN: SELECT [column-list] FROM [table1] NATURAL JOIN [table2]
  • JOIN USING: SELECT [column-list] FROM [table1] JOIN [table2] USING ([common-column])
  • JOIN ON: SELECT [column-list] FROM [table1] JOIN [table2] ON [join-condition]
  • LEFT OUTER JOIN: SELECT [column-list] FROM [table1] LEFT OUTER JOIN [table2] ON [join-condition]
  • RIGHT OUTER JOIN: SELECT [column-list] FROM [table1] RIGHT OUTER JOIN [table2] ON [join-condition]
  • FULL OUTER JOIN: SELECT [column-list] FROM [table1] FULL OUTER JOIN [table2] ON [join-condition]
  • CROSS JOIN: SELECT [column-list] FROM [table1] CROSS JOIN [table2]

Examples

1
2
# Natural join to get all customers and their invoices joined on some auto selected common attribute.
SELECT CUS_CODE, CUS_LNAME, INV_NUMBER, INV_DATE FROM CUSTOMER NATURAL JOIN INVOICE;
1
2
# Join Using to get product to vendor relation USING V_CODE.
SELECT P_CODE, P_DESCRIPT, V_CODE, V_NAME, V_AREACODE, V_PHONE FROM PRODUCT JOIN VENDOR USING (V_CODE);
1
2
# Joins Invoice and Line tables utilizing ON equality clause. 
SELECT INVOICE.INV_NUMBER, PRODUCT.P_CODE, P_DESCRIPT, LINE_UNITS, LINE_PRICE FROM INVOICE JOIN LINE ON INVOICE.INV_NUMBER = LINE.INV_NUMBER JOIN PRODUCT ON LINE.P_CODE = PRODUCT.P_CODE;
1
2
# Demonstrating using table qualifiers to project duplicate common attributes.
SELECT P_CODE, VENDOR.V_CODE, V_NAME FROM PRODUCT JOIN VENDOR ON PRODUCT.V_CODE = VENDOR.V_CODE;
1
2
# Result includes all product rows without matching vendors as well as all vendor rows without matching products
SELECT P_CODE, VENDOR.V_CODE, V_NAME FROM VENDOR FULL JOIN PRODUCT ON VENDOR. V_CODE = PRODUCT.V_CODE;
1
2
# Combines all invoices with all invoice lines.
SELECT * FROM INVOICE CROSS JOIN LINE;
1
2
# Example of a table alias to handle a recursive joining an employee to their manager
SELECT E.EMP_NUM, E.EMP_LNAME, E.EMP_MGR, M.EMP_LNAME FROM EMP E JOIN EMP M ON E.EMP_MGR = M.EMP_NUM;

Sort, Search/Restriction Criteria

Basic Terms

  • ORDER BY: Sorts result rows by list of columns in order from first to last in list.
    • NULLS in order by column list results either come first or last depending on the RDBMS
    • Oracle allows ordering NULLS with ORDER BY column DESC NULLS LAST; or NULLS FIRST
  • WHERE: Adds conditional restrictions to SELECT or search criteria allowing limits to rows in query results.
    • Comparison Operators in Where clause
      • Comparison Operators can be used on character data to compare alphebetically left-to-right.
      • Numbers or Dates stored incorrectly as character data can produce anomalies when comparison operators are used.
    • Logical Operators (AND, OR, and NOT): Allow to test multiple conditions in WHERE condition list
      • AND: Used in WHERE or HAVING clause, and all conditions must be true.
      • OR: Used in WHERE or HAVING clause, and any of the conditions must be true.
      • NOT: negates result of other conditions, turns TRUE to FALSE and vice versa
      • Logical operators can be combined, but must be organized with parenthesis.
  • Old Syle Joins: Joins that have a comma separated list of tables in the FROM clause and the join conditions in the WHERE clause.
    • No longer recommended for 2 major reasons
      • Splits joins between FROM and WHERE clauses making them less human readable.
      • Susceptable to undetected errors because RDBMS can’t determine if JOIN clauses actually exist syntactically.
Note

Most RDBMS search criteria are case sensitive, so its a good idea to wrap column with UPPER() functions before performing comparisons in the following Special Operators.

  • Special Operators:
    • BETWEEN: Used to check if value is in a range (Always list LOW test value first to avoid Empty results).
    • IN: Used to check if value is in a list.
      • Used heavily in subqueries as a subquery can return the list of things to compare to.
    • LIKE: Used to match a character pattern
      • Like can use multiple wild-cards in its matches.
      • ’%’ is a wild-card strings to match anything before, between, or after its appearance.
      • ‘_’ is a wild-card to match just a single character wherever it appears.
      • Access uses ‘*’ and ‘?’ instead of ‘%’ and ‘_’
    • IS NULL: Used to detect NULL values specifically.
      • This operator is needed because using the = operator to compare to NULL is undefined, meaning it will always be false, thus yeilding empty results..

Basic Structure

  • ORDER By: SELECT [column-list] FROM [table-list] ORDER BY [column-list ASC | DESC];
  • WHERE: SELECT [columnlist] FROM [tablelist] WHERE [conditionlist] ORDER BY [columnlist ASC | DESC];
  • BETWEEN: SELECT [columnlist] FROM [tablelist] WHERE [column] (NOT) BETWEEN [min value] AND [max value];
  • IN: SELECT [columnlist] FROM [tablelist] WHERE [column] (NOT) IN ( [comma-sep-value-list] );
  • LIKE: SELECT [columnlist] FROM [tablelist] WHERE UPPER( [column] ) (NOT) LIKE [‘pattern to match’];
  • IS NULL: SELECT [columnlist] FROM [tablelist] WHERE [column] IS (NOT) NULL;

Examples

1
2
3
# Get list of products sorted by Price from High to Low.
SELECT P_CODE, P_DESCRIPT, P_QOH, P_PRICE
  FROM PRODUCT ORDER BY P_PRICE DESC;
1
2
3
4
# Cascading order sequence lname, fname, then middle initial
SELECT EMP_LNAME, EMP_FNAME, EMP_INITIAL, EMP_AREACODE, EMP_PHONE
  FROM EMPLOYEE
ORDER BY EMP_LNAME, EMP_FNAME, EMP_INITIAL;
1
2
3
4
# Order by with derived column/attribute
SELECT P_CODE, P_DESCRIPT, V_CODE, P_PRICE * P_QOH AS TOTAL
  FROM PRODUCT
ORDER BY V_CODE, TOTAL DESC;
1
2
# Select product info for all vendors except 21344
SELECT P_DESCRIPT, P_QOH, P_PRICE, V_CODE FROM PRODUCT WHERE V_CODE <> 21344;
1
2
# Date Query MSSQL
SELECT P_DESCRIPT, P_QOH, P_MIN, P_PRICE, P_INDATE FROM PRODUCT WHERE P_INDATE >= '20-Jan-2018';
1
2
# Date Query MySQL
SELECT P_DESCRIPT, P_QOH, P_MIN, P_PRICE, P_INDATE FROM PRODUCT WHERE P_INDATE >= '2018-01-21';
1
2
# Date Query Access
SELECT P_DESCRIPT, P_QOH, P_MIN, P_PRICE, P_INDATE FROM PRODUCT WHERE P_INDATE >= #20-Jan-2018#;
1
2
3
4
5
# Get all the products from Vendor 25595 and 24288 that are over $100
SELECT P_DESCRIPT, P_PRICE, V_CODE
  FROM PRODUCT WHERE (V_CODE = 25595 OR V_CODE = 24288) AND P_PRICE > 100;
# Or with IN operator
SELECT * FROM PRODUCT WHERE V_CODE IN ('25595', '24288') AND P_PRICE > 100;
1
2
# Get vendors with a contact where the last name contains OS as the 2nd and 3rd characters.
SELECT V_NAME, V_CONTACT, V_AREACODE, V_PHONE FROM VENDOR WHERE UPPER(V_CONTACT) LIKE '_OS%';
1
2
# Get all products that don't have a vendor (i.e. NULL vendor).
SELECT P_CODE, P_DESCRIPT, V_CODE FROM PRODUCT WHERE V_CODE IS NULL;

Aggregation

  • Most aggregate functions take a single attribue as a parameter.
  • Aggregate functions generally take entire collection of rows from an attribute and reduce it to a single rowed result.

Aggregate Functions

Basic Terms

  • COUNT typically doesn’t count NULLs, so you’ll want COUNT(*) if you want a row count.
  • COUNT can be combined with DISTINCT to only count unique values e.g. SELECT COUNT(DISTINCT V_CODE) AS CNT FROM PRODUCT;
  • MAX and MIN: Returns the maximum or minimum value of an attribute column.
    • MAX and MIN are also great for getting oldest or newest dates
  • SUM The sum of the values of a specified column
    • Aggreget functions like SUM and AVG can also take derived values as their parameter, like (P_PRICE * P_QTY), which is the price of all QTY on hand.
  • AVG The average of the values of a specified column
  • GROUP BY: Groups rows into smaller collections by a column list
    • When used, Aggregate functions will operate on the grouped collections (i.e. creates frequency distributions).
    • Treats all NULLs as if they are the same when grouping.
    • Adding additional attributes to the Group By clause can cause the groups to change.
  • HAVING: Allows for restricting data based on an aggregate value (i.e. a group by attribute value).
    • Operates simlar to WHERE clause in standard SELECT, but WHERE applies to attributes of individual rows and HAVING applies to output of GROUP BY.
    • WHERE clauses execute before GROUP BY clause, therefore they can not affect aggregating or have aggregate functions
    • HAVING executes after GROUP BY
    • HAVING can not restrict some rows and leave others it affects all rows.
    • HAVING clauses often contain Aggregate Functions because they affect all rows and execute after GROUP BY.

Basic Structure

  • GROUP BY: SELECT [columnlist] FROM [tablelist] WHERE [conditionlist] GROUP BY [columnlist] HAVING [group conditionlist] ORDER BY [columnlist ASC | DESC]

Examples

1
2
# Get a list of all unique vendors you sell products from
SELECT COUNT(DISTINCT V_CODE) AS "COUNT DISTINCT" FROM PRODUCT;
1
2
# Get the highest and lowest prices in the product table
SELECT MAX(P_PRICE) AS MAXPRICE, MIN(P_PRICE) AS MINPRICE FROM PRODUCT;
1
2
# Get Total still owed by customers
SELECT SUM(CUS_BALANCE) AS TOTBALANCE FROM CUSTOMER;
1
2
# Get the average price of products in the product table
SELECT AVG(P_PRICE) AS AVGPRICE FROM PRODUCT;
1
2
# Get average price of products by vendor
SELECT V_CODE, AVG(P_PRICE) AS AVGPRICE FROM PRODUCT GROUP BY V_CODE;
1
2
3
4
# Get a count of products and averagre prices for each vendor
SELECT V_CODE, V_NAME, COUNT(P_CODE) AS NUMPROD, AVG(P_PRICE) AS AVGPRICE
  FROM PRODUCT JOIN VENDOR ON PRODUCT.V_CODE = VENDOR.V_CODE
GROUP BY V_CODE, V_NAME ORDER BY V_NAME;
1
2
3
4
5
6
# Get the number of products in inventory by vendor that have a price average less than $10
SELECT V_CODE, V_NAME, COUNT(P_CODE) AS NUMPRODS
  FROM PRODUCT
GROUP BY V_CODE
  HAVING AVG(P_PRICE) < 10
ORDER BY V_CODE;
 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
# Joins the product and vendor tables using V_CODE as the common attribute
# Restricts to only the rows with a discount greater than 0
# Groups the remaining rows into collections based on V_CODE and V_NAME
# Aggregates the total cost of products in each group
# Restricts to only the groups with totals that exceed $500
# Lists the results in descending order by the total cost
SELECT V_CODE, V_NAME, SUM(P_QOH * P_PRICE) AS TOTCOST
FROM PRODUCT JOIN VENDOR ON PRODUCT.V_CODE = VENDOR.V_CODE
WHERE P_DISCOUNT > 0
GROUP BY V_CODE, V_NAME
HAVING (SUM(P_QOH * P_PRICE) > 500)
ORDER BY SUM(P_QOH * P_PRICE) DESC;

Subqueries and Preprocessing

  • Subqueries are used to process data from other processed data.
  • Subqueries can be used in DML operations INSERT, UPDATE, and DELETE.
  • Subqueries are always on right side of comparison or assignment expression.
  • Subqueries can return a single value (1 col, 1row) or list of values (1 col, n rows) or **virtual table(m col, n rows)
    • single value is the most common subquery and is most often used in conditional expressions (i.e. >, <, =, >=, <=) in the WHERE clause.
      • The value returned must match the data-type of the comparison attribute on the LHS of the comparison.
    • list of values often used with IN, ALL, or ANY operators in a WHERE clause.
    • virtual tables often used in DML opearations or nested SELECTs.
  • Subqueries can be utilized in WHERE or HAVING classes.
  • You can use expressions (i.e. (A * B) >= (subquery) ) instead of attribute columns in subqueries, and anywhere column attributes are expected.

Basic Characteristics

  • A subquery is a query (SELECT statement) inside another query.
  • A subquery is normally expressed inside parentheses.
  • The first query in the SQL statement is known as the outer query.
  • The query inside the SQL statement is known as the inner query.
  • The inner query is executed first.
  • The output of an inner query is used as the input for the outer query.
  • The entire SQL statement is sometimes referred to as a nested query.

Basic Terms

  • Subqueries with WHERE and HAVING (i.e. list of valuels)
    • IN: list equality comparison
    • ALL: compares a single value with a list via comparison operator other than “=” (i.e. WHERE (A * B) > ALL (subquery that returns a list)).
    • ANY: compares a single value with a list via comparison operator and selects ANY row that satisfies conditon (i.e. WHERE (A * B) > ANY (subquery that returns a list)).
  • Subqueries with FROM (i.e. virtual tables)
    • SELECT FROM t1 JOIN (SELECT …) AS t2
  • Correlated Subqueries
    • Subqueries that use data from Outer query and execute for each row.
  • EXISTS checks if a subquery returns rows
    • Can be use with correlated or uncorrelated subqueries
    • USUALLY used with correlated subqueries

Basic Structure

Examples

1
2
3
4
5
# 1 way to get all vendors that haven't provided products
SELECT V_CODE, V_NAME FROM PRODUCT RIGHT JOIN VENDOR ON PRODUCT.V_CODE = VENDOR.V_CODE WHERE P_CODE IS NULL;
# Subquery method to get all vendors that haven't provided products
# Sub query answers, get all vendors that do provide products.
SELECT V_CODE, V_NAME FROM VENDOR WHERE V_CODE NOT IN (SELECT V_CODE FROM PRODUCT WHERE V_CODE IS NOT NULL);
1
2
3
# Get a list of products with price greater than the average product price.
# Sub query answers, what is average price for all products
SELECT P_CODE, P_PRICE FROM PRODUCT WHERE P_PRICE >= (SELECT AVG(P_PRICE) FROM PRODUCT);
1
2
# Find the product with the greatest inventory value.
SELECT * FROM PRODUCT WHERE P_QOH * P_PRICE = (SELECT MAX(P_QOH * P_PRICE) FROM PRODUCT);
1
2
3
4
5
6
7
# Find customers who ordered a claw hammer (i.e. subquery used w/ joins)
SELECT DISTINCT CUS_CODE, CUS_LNAME, CUS_FNAME
  FROM CUSTOMER JOIN INVOICE USING (CUS_CODE)
                JOIN LINE USING (INV_NUMBER)
                JOIN PRODUCT USING (P_CODE)
  WHERE
    P_CODE = (SELECT P_CODE FROM PRODUCT WHERE P_ DESCRIPT = 'Claw hammer');
1
2
3
4
5
6
7
8
9
# Find all customers who ordered hammers, saws, or saw blades (i.e. need to use IN for subquery)
SELECT
DISTINCT CUSTOMER.CUS_CODE, CUS_LNAME, CUS_FNAME
FROM
CUSTOMER JOIN INVOICE ON CUSTOMER.CUS_CODE = INVOICE.CUS_CODE
         JOIN LINE ON INVOICE.INV_NUMBER = LINE.INV_NUMBER
         JOIN PRODUCT ON LINE.P_CODE = PRODUCT.P_CODE
WHERE P_CODE IN (
  SELECT P_CODE FROM PRODUCT WHERE P_DESCRIPT LIKE '%hammer%' OR P_DESCRIPT LIKE '%saw%');
1
2
3
# List all products with a total quantity sold greater than the average quantity sold
SELECT P_CODE, SUM(LINE_UNITS) AS TOTALUNITS FROM LINE
GROUP BY P_CODE HAVING SUM(LINE_UNITS) > (SELECT AVG(LINE_UNITS) FROM LINE);
 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
# Select all customers who purchased 2 different products (i.e. subquery in FROM, virtual tables).
SELECT DISTINCT CUSTOMER.CUS_CODE, CUSTOMER.CUS_LNAME
FROM CUSTOMER JOIN
        (SELECT INVOICE.CUS_CODE FROM INVOICE JOIN LINE ON
         INVOICE.INV_NUMBER = LINE.INV_NUMBER
         WHERE P_CODE = '13-Q2/P2') CP1
     ON CUSTOMER.CUST_CODE = CP1.CUS_CODE
     JOIN
        (SELECT INVOICE.CUS_CODE FROM INVOICE JOIN LINE ON
        INVOICE.INV_NUMBER = LINE.INV_NUMBER
        WHERE P_CODE = '23109-HB') CP2
     ON CP1.CUS_CODE = CP2.CUS_CODE;
1
2
3
4
# Get the difference in the price of a product and its averate (i.e. inline subquery, single value, can't use alias in derived columns).
SELECT P_CODE, P_PRICE, (SELECT AVG(P_PRICE) FROM PRODUCT) AS AVGPRICE,
       P_PRICE – (SELECT AVG(P_PRICE) FROM PRODUCT) AS DIFF
FROM PRODUCT;
 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
# Get all product sales in which the units sold value is greater than the average units sold value for that product (as opposed to the average for all products):
# 1. Compute the average units sold for a product.
# 2. Compare the average computed in Step 1 to the units sold in each sale row, and then select only the rows in which the number of units sold is greater.
SELECT INV_NUMBER, P_CODE, LINE_UNITS,
      (SELECT AVG(LINE_UNITS)
       FROM LINE LX
       WHERE LX.P_CODE = LS.P_CODE) AS AVG
FROM LINE LS
WHERE LS.LINE_UNITS > (SELECT AVG(LINE_UNITS)
                       FROM LINE LA
                       WHERE LA.P_CODE = LS.P_CODE);
1
2
3
4
5
6
7
8
# Get the vendor code and vendor name for products with a quantity on hand that is less than double the minimum quantity (i.e. EXISTS on correlated query example).
# 1. The inner correlated subquery runs using the first vendor.
# 2. If any products match the condition (the quantity on hand is less than double the minimum quantity), the vendor code and name are listed in the output.
# 3. The correlated subquery runs using the second vendor, and the process repeats itself until all vendors are used.
SELECT V_CODE, V_NAME
FROM VENDOR
WHERE EXISTS (
    SELECT * FROM PRODUCT WHERE P_QOH < P_MIN * 2 AND VENDOR.V_CODE = PRODUCT.V_CODE);

Data Manipulation Functions

  • Functions can decompose data elements (i.e. parts of date data).
  • Functions can convert data element type (i.e. parsing integers).
  • Functions VARY greatly between DBMS vendors.
  • Functions always use a numerical, date, or string value.
  • A function may appear anywhere in a SQL statement where a value or an attribute can be used.
  • There are many types of SQL functions, such as arithmetic, trigonometric, string, date, and time functions.

Common Function Types

  • Date and Time Functions:
    • SQL Server: GETDATE, YEAR, MONTH, DAY, DATEDIFF
    • ORACLE: TO_DATE, SYSDATE, ADD_MONTHS, LAST_DAY
    • MySQL: Date_Format, YEAR, MONTH, DAY, DATE_ADD
  • Conversion Functions
    • SQL Server: CONVERT, CAST, CASE
    • ORACLE: CAST, CASE, TO_CHAR, TO_DATE, TO_NUMBER, DECODE
    • MySQL CONVERT, CAST, CASE
  • Numeric Functions
    • ALL DBMS: ABS, ROUND, CEIL, FLOOR
  • String Functions
    • SQL Server: Concatenation is +, UPPER, LOWER, SUBSTRING, LEN
    • ORACLE: Concatenation is ||, UPPER, LOWER, SUBSTR, LENGTH
    • MySQL : CONCAT, UPPER, LOWER, SUBSTR, LENGTH

Relational Set Operators

Basic Terms

  • UNION: Same as relational set union.
  • INTERSECT: Same as relational set intersect.
  • EXCEPT/MINUS: Similar to relational set difference.

Basic Structure

  • UNION: query UNION query
  • INTERSECT: query INTERSECT query
    • If DBMS has NO INTERSECT: SELECT some_att FROM some_tab WHERE some_att IN (SELECT DISTINCT some_att FROM some_other_tab);
  • EXCEPT/MINUS: query MINUS query, query EXCEPT query
    • If DBMS has NO MINUS/EXCEPT: SELECT some_att FROM some_tab WHERE some_att NOT IN (SELECT DISTINCT some_att FROM some_other_tab);

Examples

1
2
3
4
5
6
7
8
# Combine 2 separate customer tables as one, but eliminate duplicates (UNION ALL if you want duplicates).
# Column data-types and count must match.
# WARNING: Some DBMSs don't eliminate duplicates.
SELECT CUS_LNAME, CUS_FNAME, CUS_INITIAL, CUS_AREACODE, CUS_PHONE
FROM CUSTOMER
UNION
SELECT CUS_LNAME, CUS_FNAME, CUS_INITIAL, CUS_AREACODE,
CUS_PHONE FROM CUSTOMER_2;
1
2
3
4
5
6
7
# Find all the common customer rows between the separate customer tables.
# Column data-types and count must match.
SELECT CUS_LNAME, CUS_FNAME, CUS_INITIAL, CUS_AREACODE, CUS_PHONE
FROM CUSTOMER
INTERSECT
SELECT CUS_LNAME, CUS_FNAME, CUS_INITIAL, CUS_AREACODE,
CUS_PHONE FROM CUSTOMER_2;
1
2
3
4
5
6
7
# Find all the customers in Customer table not in Customer_2
# MINUS is Oracle specific, SQL Server uses EXCEPT
SELECT CUS_LNAME, CUS_FNAME, CUS_INITIAL, CUS_AREACODE, CUS_PHONE
FROM CUSTOMER
MINUS
SELECT CUS_LNAME, CUS_FNAME, CUS_INITIAL, CUS_AREACODE,
CUS_PHONE FROM CUSTOMER_2;

Key Query Principles

  • Know Your Data: Data in the real-world is often not well formed (i.e. missing PKs, FKs, Unormalized), so to be productive, you have know know it inside out.
  • Know the Problem: Make sure the person asking the question knows what they are asking for because there are different ways to interpret data.
  • Build One Clause at a Time: If you know your data and understand the problem, map out your query in this order: FROM, WHERE, GROUP BY, HAVING, SELECT, ORDER BY
    • This will help break the problem down into manageable single clauses.