{ "cells": [ { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "Lecture 3: Multi-Table SQL\n", "======================" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Joins!\n", "------\n", "Let's illustrate some more complex queries that join two tables together.\n", "\n", "* Consider a table of companies, stock price, and HQ country.\n", "> company(cname, stockprice, country)\n", " \n", "* we'll then revist products and introduce some consistency requirements " ] }, { "cell_type": "code", "execution_count": 4, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Done.\n", "Done.\n", "Done.\n", "1 rows affected.\n", "1 rows affected.\n", "1 rows affected.\n" ] }, { "data": { "text/plain": [ "[]" ] }, "execution_count": 4, "metadata": {}, "output_type": "execute_result" } ], "source": [ "%%sql\n", "drop table if exists product; -- This needs to be dropped if exists, see why further down!\n", "drop table if exists company;\n", "create table company (\n", " cname varchar primary key, -- company name uniquely identifies the company.\n", " stockprice money, -- stock price is in money \n", " country varchar); -- country is just a string\n", "insert into company values ('GizmoWorks', 25.0, 'USA');\n", "insert into company values ('Canon', 65.0, 'Japan');\n", "insert into company values ('Hitachi', 15.0, 'Japan');" ] }, { "cell_type": "code", "execution_count": 5, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Done.\n" ] }, { "data": { "text/html": [ "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
cnamestockpricecountry
GizmoWorks25USA
Canon65Japan
Hitachi15Japan
" ], "text/plain": [ "[(u'GizmoWorks', 25, u'USA'),\n", " (u'Canon', 65, u'Japan'),\n", " (u'Hitachi', 15, u'Japan')]" ] }, "execution_count": 5, "metadata": {}, "output_type": "execute_result" } ], "source": [ "%sql select * from company;" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Foreign Key Constraints\n", "-----------------------\n", "* Suppose that we want to create a products table\n", "\n", "> Product(pname, price, category, manufacturer)\n", "\n", "* Something is odd here: We can have manufacturers that sell products but don't occur in our company table!\n", "* To protect against, this we introduce _foreign keys_ " ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We say the company name in products _refers_ to cname in company. Let's do it in SQL! the key statement below is:\n", "\n", "> foreign key (manufacturer) references company(cname)\n", "\n", " * Note that cname must be a key in company! \n", " * Keys and Foreign keys come up _all_ the time. \n", " * PKs and FKs are common (others, less so)" ] }, { "cell_type": "code", "execution_count": 6, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Done.\n", "Done.\n", "Done.\n", "1 rows affected.\n", "1 rows affected.\n", "1 rows affected.\n", "1 rows affected.\n" ] }, { "data": { "text/plain": [ "[]" ] }, "execution_count": 6, "metadata": {}, "output_type": "execute_result" } ], "source": [ "%%sql drop table if exists product;\n", "pragma foreign_keys = ON; -- WARNING by default off in sqlite\n", "create table product(\n", " pname varchar primary key, -- name of the product\n", " price money, -- price of the product\n", " category varchar, -- category\n", " manufacturer varchar, -- manufacturer\n", " foreign key (manufacturer) references company(cname));\n", "\n", "insert into product values('Gizmo', 19.99, 'Gadgets', 'GizmoWorks');\n", "insert into product values('PowerGizmo', 29.99, 'Gadgets', 'GizmoWorks');\n", "insert into product values('SingleTouch', 149.99, 'Photography', 'Canon');\n", "insert into product values('MultiTouch', 203.99, 'Household', 'Hitachi');" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Indeed foreign keys are a _constraint_ \n", "> What happens if we introduce a company name not in our table?\n" ] }, { "cell_type": "code", "execution_count": 7, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "(IntegrityError) UNIQUE constraint failed: product.pname u\"insert into product values('MultiTouch', 203.99, 'Household', 'Google');\" ()\n", "Rejected!\n" ] } ], "source": [ "try:\n", " %sql insert into product values('MultiTouch', 203.99, 'Household', 'Google');\n", "except Exception as e:\n", " print e\n", " print \"Rejected!\"" ] }, { "cell_type": "code", "execution_count": 8, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Done.\n" ] }, { "data": { "text/html": [ "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
pnamepricecategorymanufacturer
Gizmo19.99GadgetsGizmoWorks
PowerGizmo29.99GadgetsGizmoWorks
SingleTouch149.99PhotographyCanon
MultiTouch203.99HouseholdHitachi
" ], "text/plain": [ "[(u'Gizmo', 19.99, u'Gadgets', u'GizmoWorks'),\n", " (u'PowerGizmo', 29.99, u'Gadgets', u'GizmoWorks'),\n", " (u'SingleTouch', 149.99, u'Photography', u'Canon'),\n", " (u'MultiTouch', 203.99, u'Household', u'Hitachi')]" ] }, "execution_count": 8, "metadata": {}, "output_type": "execute_result" } ], "source": [ "%%sql\n", "-- the update is rejected!\n", "select * from product;" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Foreign Keys and Delete\n", "=============\n", "\n", "* What happens if we delete a company? Three options:\n", " * Disallow the delete. (default)\n", " * Remove all products (add \"`on delete cascade`\")\n", " * A third variant due to NULL\n", " " ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "**First option (default)- delete is disallowed**" ] }, { "cell_type": "code", "execution_count": 10, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "(IntegrityError) FOREIGN KEY constraint failed u\"delete from company where cname = 'Hitachi';\" ()\n", "Disallowed!\n" ] } ], "source": [ "try:\n", " %sql delete from company where cname = 'Hitachi';\n", "except Exception as e:\n", " print e\n", " print \"Disallowed!\"" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "**Second option: remove all products belonging to the company we delete**\n", "\n", "Try adjusting the foreign key constraint clause when you create the products table as follows:\n", "> foreign key (manufacturer) references company(cname) on delete cascade\n", "\n", "Now, when a company row is deleted, all of the products linked by the foreign key constraint will be deleted as well." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Joins!\n", "------\n", "> Product (pname, price, category, manufacturer)
\n", "> Company (cname, stockPrice, country)\n", "\n", "The query we want to answer is:\n", "\n", "> Find all products under $200 manufactured in Japan;\n", "> return their names and prices. \n", "\n", "Notice products don't have a location and manufacturers don't have price. Need info in _each_ of the tables." ] }, { "cell_type": "code", "execution_count": 11, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Done.\n" ] }, { "data": { "text/html": [ "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
pnameprice
SingleTouch149.99
" ], "text/plain": [ "[(u'SingleTouch', 149.99)]" ] }, "execution_count": 11, "metadata": {}, "output_type": "execute_result" } ], "source": [ "%%sql\n", "SELECT pname, price\n", "FROM product, company\n", "where manufacturer=cname and country='Japan' and price <= 200;" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Let's see how to write this join in a modular way." ] }, { "cell_type": "code", "execution_count": 12, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Done.\n" ] }, { "data": { "text/html": [ "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
cname
Canon
Hitachi
" ], "text/plain": [ "[(u'Canon',), (u'Hitachi',)]" ] }, "execution_count": 12, "metadata": {}, "output_type": "execute_result" } ], "source": [ "%%sql -- Part 1: Set of Japanese companies.\n", "SELECT distinct cname -- do we need distinct?\n", "from company where country='Japan';" ] }, { "cell_type": "code", "execution_count": 13, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Done.\n" ] }, { "data": { "text/html": [ "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
pnamepricemanufacturer
Gizmo19.99GizmoWorks
PowerGizmo29.99GizmoWorks
SingleTouch149.99Canon
" ], "text/plain": [ "[(u'Gizmo', 19.99, u'GizmoWorks'),\n", " (u'PowerGizmo', 29.99, u'GizmoWorks'),\n", " (u'SingleTouch', 149.99, u'Canon')]" ] }, "execution_count": 13, "metadata": {}, "output_type": "execute_result" } ], "source": [ "%%sql -- Part 2: Cheap Products (under $200)\n", "select distinct pname, price, manufacturer\n", "from product\n", "where price <= 200;" ] }, { "cell_type": "code", "execution_count": 14, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Done.\n" ] }, { "data": { "text/html": [ "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
pnamepricemanufacturercname
Gizmo19.99GizmoWorksCanon
Gizmo19.99GizmoWorksHitachi
PowerGizmo29.99GizmoWorksCanon
PowerGizmo29.99GizmoWorksHitachi
SingleTouch149.99CanonCanon
SingleTouch149.99CanonHitachi
" ], "text/plain": [ "[(u'Gizmo', 19.99, u'GizmoWorks', u'Canon'),\n", " (u'Gizmo', 19.99, u'GizmoWorks', u'Hitachi'),\n", " (u'PowerGizmo', 29.99, u'GizmoWorks', u'Canon'),\n", " (u'PowerGizmo', 29.99, u'GizmoWorks', u'Hitachi'),\n", " (u'SingleTouch', 149.99, u'Canon', u'Canon'),\n", " (u'SingleTouch', 149.99, u'Canon', u'Hitachi')]" ] }, "execution_count": 14, "metadata": {}, "output_type": "execute_result" } ], "source": [ "%%sql -- combine them with nested SFW queries... this is a cross product?\n", "SELECT * \n", "FROM \n", " (SELECT DISTINCT pname, price, manufacturer\n", " FROM product\n", " WHERE price <= 200) CheapProducts,\n", " (SELECT DISTINCT cname\n", " FROM company\n", " WHERE country='Japan') JapaneseProducts;" ] }, { "cell_type": "code", "execution_count": 15, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Done.\n" ] }, { "data": { "text/html": [ "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
pnameprice
SingleTouch149.99
" ], "text/plain": [ "[(u'SingleTouch', 149.99)]" ] }, "execution_count": 15, "metadata": {}, "output_type": "execute_result" } ], "source": [ "%%sql\n", "-- Combine them as a join!\n", "SELECT DISTINCT pname, price\n", "FROM \n", " (SELECT DISTINCT pname, price, manufacturer\n", " FROM product\n", " WHERE price <= 200) CheapProducts,\n", " (SELECT distinct cname\n", " FROM company\n", " WHERE country='Japan') JapaneseProducts\n", "WHERE cname = manufacturer;" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Takeways\n", "--------\n", "* There are potentially _many logically equivalent ways_ to write a query\n", " * This fact will be used later by the query optimizer and in homework!\n", " * On exams, write the simplest thing (break it down in parts?)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Duplicate answers from join\n", "--------------------------\n", "\n", "Note that we can get duplicate answers from a join..." ] }, { "cell_type": "code", "execution_count": 16, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Done.\n" ] }, { "data": { "text/html": [ "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
country
USA
USA
" ], "text/plain": [ "[(u'USA',), (u'USA',)]" ] }, "execution_count": 16, "metadata": {}, "output_type": "execute_result" } ], "source": [ "%%sql -- duplicate answer\n", "SELECT Country\n", "FROM Product, Company\n", "WHERE Manufacturer=CName AND Category='Gadgets';" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "skip" } }, "source": [ "NOTE:\n", "-----\n", "\n", "Please run the below cells first before proceeding- you'll need them soon!" ] }, { "cell_type": "code", "execution_count": 2, "metadata": { "slideshow": { "slide_type": "skip" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Done.\n", "Done.\n", "Done.\n", "Done.\n", "1 rows affected.\n", "1 rows affected.\n", "1 rows affected.\n", "Done.\n", "1 rows affected.\n", "1 rows affected.\n", "1 rows affected.\n", "1 rows affected.\n" ] }, { "data": { "text/plain": [ "[]" ] }, "execution_count": 2, "metadata": {}, "output_type": "execute_result" } ], "source": [ "%%sql\n", "pragma foreign_keys = ON; -- WARNING: by default off in sqlite\n", "drop table if exists product; -- This needs to be dropped if exists, see why further down!\n", "drop table if exists company;\n", "create table company (\n", " cname varchar primary key, -- company name uniquely identifies the company.\n", " stockprice money, -- stock price is in money \n", " country varchar); -- country is just a string\n", "insert into company values ('GizmoWorks', 25.0, 'USA');\n", "insert into company values ('Canon', 65.0, 'Japan');\n", "insert into company values ('Hitachi', 15.0, 'Japan');\n", "create table product(\n", " pname varchar primary key, -- name of the product\n", " price money, -- price of the product\n", " category varchar, -- category\n", " manufacturer varchar, -- manufacturer\n", " foreign key (manufacturer) references company(cname));\n", "insert into product values('Gizmo', 19.99, 'Gadgets', 'GizmoWorks');\n", "insert into product values('SingleTouch', 149.99, 'Photography', 'Canon');\n", "insert into product values('PowerGizmo', 29.99, 'Gadgets', 'GizmoWorks');\n", "insert into product values('MultiTouch', 203.99, 'Household', 'Hitachi');" ] }, { "cell_type": "code", "execution_count": 3, "metadata": { "slideshow": { "slide_type": "skip" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Done.\n", "Done.\n", "1 rows affected.\n", "1 rows affected.\n", "1 rows affected.\n", "Done.\n", "Done.\n", "1 rows affected.\n", "1 rows affected.\n", "1 rows affected.\n", "1 rows affected.\n", "1 rows affected.\n", "Done.\n", "Done.\n", "1 rows affected.\n", "1 rows affected.\n", "1 rows affected.\n", "1 rows affected.\n", "1 rows affected.\n", "1 rows affected.\n", "Done.\n", "Done.\n", "1 rows affected.\n", "1 rows affected.\n", "1 rows affected.\n", "1 rows affected.\n", "1 rows affected.\n", "1 rows affected.\n" ] }, { "data": { "text/plain": [ "[]" ] }, "execution_count": 3, "metadata": {}, "output_type": "execute_result" } ], "source": [ "%%sql\n", "DROP TABLE IF EXISTS franchise;\n", "CREATE TABLE franchise (name TEXT, db_type TEXT);\n", "INSERT INTO franchise VALUES ('Bobs Bagels', 'NoSQL');\n", "INSERT INTO franchise VALUES ('eBagel', 'NoSQL');\n", "INSERT INTO franchise VALUES ('BAGEL CORP', 'MySQL');\n", "\n", "DROP TABLE IF EXISTS store;\n", "CREATE TABLE store (franchise TEXT, location TEXT);\n", "INSERT INTO store VALUES ('Bobs Bagels', 'NYC');\n", "INSERT INTO store VALUES ('eBagel', 'PA');\n", "INSERT INTO store VALUES ('BAGEL CORP', 'Chicago');\n", "INSERT INTO store VALUES ('BAGEL CORP', 'NYC');\n", "INSERT INTO store VALUES ('BAGEL CORP', 'PA');\n", "\n", "DROP TABLE IF EXISTS bagel;\n", "CREATE TABLE bagel (name TEXT, price MONEY, made_by TEXT);\n", "INSERT INTO bagel VALUES ('Plain with shmear', 1.99, 'Bobs Bagels');\n", "INSERT INTO bagel VALUES ('Egg with shmear', 2.39, 'Bobs Bagels');\n", "INSERT INTO bagel VALUES ('eBagel Drinkable Bagel', 27.99, 'eBagel');\n", "INSERT INTO bagel VALUES ('eBagel Expansion Pack', 1.99, 'eBagel');\n", "INSERT INTO bagel VALUES ('Plain with shmear', 0.99, 'BAGEL CORP');\n", "INSERT INTO bagel VALUES ('Organic Flax-seed bagel chips', 0.99, 'BAGEL CORP');\n", "\n", "DROP TABLE IF EXISTS purchase;\n", "-- Note that date is an int here just to simplify things\n", "CREATE TABLE purchase (bagel_name TEXT, franchise TEXT, date INT, quantity INT, purchaser_age INT);\n", "INSERT INTO purchase VALUES ('Plain with shmear', 'Bobs Bagels', 1, 12, 28);\n", "INSERT INTO purchase VALUES ('Egg with shmear', 'Bobs Bagels', 2, 6, 47);\n", "INSERT INTO purchase VALUES ('Plain with shmear', 'BAGEL CORP', 2, 12, 24);\n", "INSERT INTO purchase VALUES ('Plain with shmear', 'BAGEL CORP', 3, 1, 17);\n", "INSERT INTO purchase VALUES ('eBagel Expansion Pack', 'eBagel', 1, 137, 5);\n", "INSERT INTO purchase VALUES ('Plain with shmear', 'Bobs Bagels', 4, 24, NULL);" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "Order By Reprise\n", "================\n", "* SQL-89 forbids the following, but we can make sense of it:\n", "> SELECT pname FROM Product ORDER BY Price\n", "\n", "Let's try out some other slightly unintuitive ORDER BY queries:" ] }, { "cell_type": "code", "execution_count": 4, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Done.\n" ] }, { "data": { "text/html": [ "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
pnamepricecategorymanufacturer
Gizmo19.99GadgetsGizmoWorks
SingleTouch149.99PhotographyCanon
PowerGizmo29.99GadgetsGizmoWorks
MultiTouch203.99HouseholdHitachi
" ], "text/plain": [ "[(u'Gizmo', 19.99, u'Gadgets', u'GizmoWorks'),\n", " (u'SingleTouch', 149.99, u'Photography', u'Canon'),\n", " (u'PowerGizmo', 29.99, u'Gadgets', u'GizmoWorks'),\n", " (u'MultiTouch', 203.99, u'Household', u'Hitachi')]" ] }, "execution_count": 4, "metadata": {}, "output_type": "execute_result" } ], "source": [ "%sql SELECT * FROM Product;" ] }, { "cell_type": "code", "execution_count": 5, "metadata": { "slideshow": { "slide_type": "subslide" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Done.\n" ] }, { "data": { "text/html": [ "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
pnameprice
Gizmo19.99
MultiTouch203.99
PowerGizmo29.99
SingleTouch149.99
" ], "text/plain": [ "[(u'Gizmo', 19.99),\n", " (u'MultiTouch', 203.99),\n", " (u'PowerGizmo', 29.99),\n", " (u'SingleTouch', 149.99)]" ] }, "execution_count": 5, "metadata": {}, "output_type": "execute_result" } ], "source": [ "%%sql SELECT pname,price FROM Product\n", "ORDER BY pname" ] }, { "cell_type": "code", "execution_count": 6, "metadata": { "slideshow": { "slide_type": "subslide" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Done.\n" ] }, { "data": { "text/html": [ "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
pname
Gizmo
PowerGizmo
SingleTouch
MultiTouch
" ], "text/plain": [ "[(u'Gizmo',), (u'PowerGizmo',), (u'SingleTouch',), (u'MultiTouch',)]" ] }, "execution_count": 6, "metadata": {}, "output_type": "execute_result" } ], "source": [ "%%sql SELECT pname FROM Product\n", "ORDER BY Price" ] }, { "cell_type": "code", "execution_count": 7, "metadata": { "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Done.\n" ] }, { "data": { "text/html": [ "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
pname
Gizmo
PowerGizmo
SingleTouch
MultiTouch
" ], "text/plain": [ "[(u'Gizmo',), (u'PowerGizmo',), (u'SingleTouch',), (u'MultiTouch',)]" ] }, "execution_count": 7, "metadata": {}, "output_type": "execute_result" } ], "source": [ "%%sql SELECT distinct pname FROM Product\n", "ORDER BY Price" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "* Some engines- like Postgres- would balk at this, but some engines- like SQLite- do process this... how?" ] } ], "metadata": { "celltoolbar": "Slideshow", "kernelspec": { "display_name": "Python 2", "language": "python", "name": "python2" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 2 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython2", "version": "2.7.15" } }, "nbformat": 4, "nbformat_minor": 1 }