{ "cells": [ { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "Lecture 2: Basic Single & Multi-Table SQL\n", "======================" ] }, { "cell_type": "code", "execution_count": 1, "metadata": { "slideshow": { "slide_type": "slide" } }, "outputs": [ { "data": { "text/plain": [ "'Connected: @None'" ] }, "execution_count": 1, "metadata": {}, "output_type": "execute_result" } ], "source": [ "%load_ext sql\n", "%sql sqlite://" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "Let's create a table, stuff it with data, and query it!" ] }, { "cell_type": "code", "execution_count": 2, "metadata": { "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ " * sqlite://\n", "Done.\n", "Done.\n", "1 rows affected.\n", "1 rows affected.\n", "1 rows affected.\n", "1 rows affected.\n" ] }, { "data": { "text/plain": [ "[]" ] }, "execution_count": 2, "metadata": {}, "output_type": "execute_result" } ], "source": [ "%%sql drop table if exists product;\n", "create table product(\n", " pname varchar primary key, -- name of the product\n", " price money, -- price of the product\n", " category varchar, -- category\n", " manufacturer varchar NOT NULL -- manufacturer\n", ");\n", "insert into product values('Gizmo', 19.99, 'Gadgets', 'GizmoWorks');\n", "insert into product values('PowerGizmo', 29.99, 'Gadgets', 'GizmoWorks');\n", "insert into product values('SingleTouch', 149.99, 'Photography', 'Canon');\n", "insert into product values('MultiTouch', 203.99, 'Household', 'Hitachi');" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "Let's look at the products.." ] }, { "cell_type": "code", "execution_count": 3, "metadata": { "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ " * sqlite://\n", "Done.\n" ] }, { "data": { "text/html": [ "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
pnamepricecategorymanufacturer
Gizmo19.99GadgetsGizmoWorks
PowerGizmo29.99GadgetsGizmoWorks
SingleTouch149.99PhotographyCanon
MultiTouch203.99HouseholdHitachi
" ], "text/plain": [ "[(u'Gizmo', 19.99, u'Gadgets', u'GizmoWorks'),\n", " (u'PowerGizmo', 29.99, u'Gadgets', u'GizmoWorks'),\n", " (u'SingleTouch', 149.99, u'Photography', u'Canon'),\n", " (u'MultiTouch', 203.99, u'Household', u'Hitachi')]" ] }, "execution_count": 3, "metadata": {}, "output_type": "execute_result" } ], "source": [ "%sql select * from product;" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "Some *terminology* for SQL.\n", "--------------------------\n", "* The _name_ of the table is product.\n", "* Each row of the table is called a _row_ or a _tuple_. \n", "* Notice all tuples have the fields or _attributes_.\n", "* The number of rows is called the _cardinality_ while the number of attributes is called the _arity_" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "Schema Conventions\n", "-----------------\n", "* The schema of product is written as follows:\n", "\n", "> product(pname, price, category, manufacturer)\n", "\n", "Underlining a set of attributes indicates that they form a _key_.\n", "\n", "* In this case, pname is a key. If the product name was only unique for a given manufacturer, we'd write:\n", "\n", "> product(pname, price, category, manufacturer)\n" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "Tables Explained\n", "----------------\n", "* A tuple = a record\n", " * Restriction: all attributes are of atomic type\n", " * There are many atomic data types in SQL engines, look [here](http://www.postgresql.org/docs/9.4/static/datatype.html) for example.\n", "\n", "\n", "* A table = a (multi)-set of tuples\n", " * A multiset is like a list…\n", " * ... but a mutiset is unordered: \n", " * no first(), no next(), no last()." ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "# Outline\n", "* Create a database -- done!\n", "\n", "* Simple querying -- now!\n", "\n", "* Queries with more than one relation -- next!\n" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "# Let the (Simple) querying begin! \n", "* We'll introduce the basics of SQL by example.\n", "* There are many good SQL tutorials on the web, this is intended to get you started." ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "> SELECT (attributes)
\n", "> FROM (one or more tables)
\n", "> WHERE (conditions)\n", "\n", "This is the simple SELECT-FROM-WHERE (SFW) block. Let's see some examples!" ] }, { "cell_type": "code", "execution_count": 4, "metadata": { "slideshow": { "slide_type": "subslide" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ " * sqlite://\n", "Done.\n" ] }, { "data": { "text/html": [ "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
pnamepricecategorymanufacturer
PowerGizmo29.99GadgetsGizmoWorks
" ], "text/plain": [ "[(u'PowerGizmo', 29.99, u'Gadgets', u'GizmoWorks')]" ] }, "execution_count": 4, "metadata": {}, "output_type": "execute_result" } ], "source": [ "%%sql\n", "SELECT * from Product \n", "WHERE category='Gadgets' and price > 20.0;" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "Let's give an example of *projection*, i.e., we only retain some attributes from the query. " ] }, { "cell_type": "code", "execution_count": 5, "metadata": { "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ " * sqlite://\n", "Done.\n" ] }, { "data": { "text/html": [ "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
pnamepricemanufacturer
Gizmo19.99GizmoWorks
PowerGizmo29.99GizmoWorks
SingleTouch149.99Canon
MultiTouch203.99Hitachi
" ], "text/plain": [ "[(u'Gizmo', 19.99, u'GizmoWorks'),\n", " (u'PowerGizmo', 29.99, u'GizmoWorks'),\n", " (u'SingleTouch', 149.99, u'Canon'),\n", " (u'MultiTouch', 203.99, u'Hitachi')]" ] }, "execution_count": 5, "metadata": {}, "output_type": "execute_result" } ], "source": [ "%%sql \n", "SELECT Pname, Price, Manufacturer\n", "FROM Product;" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "* The output is *still* a table, and its schema is \n", "> Answer(pname, price, manufacturer)" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "* Of course, we can combine selection and projection." ] }, { "cell_type": "code", "execution_count": 6, "metadata": { "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ " * sqlite://\n", "Done.\n" ] }, { "data": { "text/html": [ "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
pnamepricemanufacturer
Gizmo19.99GizmoWorks
PowerGizmo29.99GizmoWorks
" ], "text/plain": [ "[(u'Gizmo', 19.99, u'GizmoWorks'), (u'PowerGizmo', 29.99, u'GizmoWorks')]" ] }, "execution_count": 6, "metadata": {}, "output_type": "execute_result" } ], "source": [ "%%sql\n", "SELECT Pname, Price, Manufacturer\n", "FROM Product\n", "WHERE category='Gadgets';" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "The output of a query on a table is again a table \n", "----------------------------------------------\n", "* This is because the query language is *compositional*\n", "* The output of a query really is a table!\n", "* look at this crazy query, what does it ask for?" ] }, { "cell_type": "code", "execution_count": 7, "metadata": { "slideshow": { "slide_type": "subslide" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ " * sqlite://\n", "Done.\n", "Done.\n" ] }, { "data": { "text/html": [ "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
manufacturerpnameprice
GizmoWorksPowerGizmo29.99
" ], "text/plain": [ "[(u'GizmoWorks', u'PowerGizmo', 29.99)]" ] }, "execution_count": 7, "metadata": {}, "output_type": "execute_result" } ], "source": [ "%%sql\n", "SELECT * FROM Product;\n", "\n", "SELECT\n", " p.manufacturer, p.pname, p.price\n", "FROM \n", " (SELECT distinct p0.Manufacturer\n", " FROM Product p0\n", " WHERE p0.price < 20.00) cp, -- this is a nested query!\n", " Product p\n", "WHERE \n", " p.manufacturer = cp.manufacturer and p.price > 20.00" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "Details on SQL\n", "--------------\n", "\n", "* Some elements are case insensitive (think: program):\n", " * Same: SELECT Select select\n", " * Same: Product product\n", " * Different: ‘Seattle’ ‘seattle’\n", " \n", "\n", "* Constants (single quotes)\n", " * ‘abc’ - yes\n", " * “abc” - no\n" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "LIKE\n", "====\n", "\n", "The LIKE operator is to search strings, perhaps with wildcards. Format is:\n", " \n", "> SELECT *\n", "> FROM Products\n", "> WHERE pname like '%gizmo%'\n", "\n", "* % matches any number of characters\n", "* \\_ matches one character\n", "* The like operator is case sensitive\n" ] }, { "cell_type": "code", "execution_count": 8, "metadata": { "slideshow": { "slide_type": "slide" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ " * sqlite://\n", "Done.\n" ] }, { "data": { "text/html": [ "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
pnamepricecategorymanufacturer
Gizmo19.99GadgetsGizmoWorks
PowerGizmo29.99GadgetsGizmoWorks
" ], "text/plain": [ "[(u'Gizmo', 19.99, u'Gadgets', u'GizmoWorks'),\n", " (u'PowerGizmo', 29.99, u'Gadgets', u'GizmoWorks')]" ] }, "execution_count": 8, "metadata": {}, "output_type": "execute_result" } ], "source": [ "%%sql\n", "SELECT * FROM product\n", "where pname LIKE '%Gizmo%'" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "Eliminating Duplicates\n", "---------------------\n", "* duplicates can sometimes be unwelcome or suprising. \n", " * Recall tables are _multisets_!" ] }, { "cell_type": "code", "execution_count": 9, "metadata": { "slideshow": { "slide_type": "subslide" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ " * sqlite://\n", "Done.\n" ] }, { "data": { "text/html": [ "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
category
Gadgets
Gadgets
Photography
Household
" ], "text/plain": [ "[(u'Gadgets',), (u'Gadgets',), (u'Photography',), (u'Household',)]" ] }, "execution_count": 9, "metadata": {}, "output_type": "execute_result" } ], "source": [ "%sql SELECT category from product;" ] }, { "cell_type": "code", "execution_count": 10, "metadata": { "slideshow": { "slide_type": "subslide" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ " * sqlite://\n", "Done.\n" ] }, { "data": { "text/html": [ "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
category
Gadgets
Photography
Household
" ], "text/plain": [ "[(u'Gadgets',), (u'Photography',), (u'Household',)]" ] }, "execution_count": 10, "metadata": {}, "output_type": "execute_result" } ], "source": [ "%%sql \n", "-- easy to remove duplicates, use the distinct keyword\n", "SELECT DISTINCT category from product;" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "Ordering the results\n", "---------------------\n", "* Sometimes you want the results ordered, let's see some examples!\n" ] }, { "cell_type": "code", "execution_count": 11, "metadata": { "slideshow": { "slide_type": "subslide" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ " * sqlite://\n", "Done.\n" ] }, { "data": { "text/html": [ "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
pnamepricemanufacturer
SingleTouch149.99Canon
MultiTouch203.99Hitachi
" ], "text/plain": [ "[(u'SingleTouch', 149.99, u'Canon'), (u'MultiTouch', 203.99, u'Hitachi')]" ] }, "execution_count": 11, "metadata": {}, "output_type": "execute_result" } ], "source": [ "%%sql\n", "-- sometimes we want to order the results.\n", "-- order by is ascending by default!\n", "SELECT pname, price, manufacturer\n", "FROM Product\n", "WHERE price > 50\n", "ORDER BY price, pname" ] }, { "cell_type": "code", "execution_count": 12, "metadata": { "slideshow": { "slide_type": "subslide" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ " * sqlite://\n", "Done.\n" ] }, { "data": { "text/html": [ "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
pricemanufacturer
149.99Canon
29.99GizmoWorks
19.99GizmoWorks
203.99Hitachi
" ], "text/plain": [ "[(149.99, u'Canon'),\n", " (29.99, u'GizmoWorks'),\n", " (19.99, u'GizmoWorks'),\n", " (203.99, u'Hitachi')]" ] }, "execution_count": 12, "metadata": {}, "output_type": "execute_result" } ], "source": [ "%%sql\n", "-- sometimes we want to order the results.\n", "-- can order like so, each component individually\n", "SELECT price, manufacturer\n", "FROM Product\n", "-- the order is \"dictionary order\" in the clause.\n", "ORDER BY manufacturer ASC, price DESC" ] } ], "metadata": { "celltoolbar": "Slideshow", "kernelspec": { "display_name": "Python 2", "language": "python", "name": "python2" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 2 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython2", "version": "2.7.15" } }, "nbformat": 4, "nbformat_minor": 1 }