6 changes: 6 additions & 0 deletions 02_activities/assignments/DC_Cohort/Assignment1.md
Expand Up @@ -211,3 +211,9 @@ Consider, for example, concepts of fariness, inequality, social structures, marg
```
Your thoughts...
```

The issues discussed here are disappointing but common in database design. Many variables are continuous, changing, and messy, which inevitably makes them difficult to describe with discrete bins. Categorization becomes even more complex when the nuances of human social systems are overlaid on them. Even relatively common situations, such as children born out of wedlock or orphaned, cannot be categorized by the system. The situation is made worse by the requirement for a functioning ID card, without which people cannot use basic services. It demonstrates an unfortunate mix of a database engineered too rigidly and a social system too dependent on reality conforming to a specific mold.

It reminds me of a similar (though much less serious) situation at my workplace. I volunteered there for several years while in school and was enrolled in the various computer and security systems as a volunteer; I recently obtained a staff position. The systems within the network are not all integrated with each other, so when my status was updated to staff and the associated privileges changed in one system, the others raised a silent error because of the mismatch, and I was barred from the whole network without explanation. The business had not encountered this situation before: no one had considered that a person's status might change after joining the network, or that the various systems might not respond properly. Fortunately it was an easy fix once identified, but it is another lesson in the need for system flexibility.

The big takeaway from the article is that databases should be flexible and able to accommodate future changes. Most of these changes are unpredictable until they are encountered, so protocols for flexibility need to be considered early on. I would argue it also shows that the initial setup of a system should include a proof of concept to make sure the most frequently encountered edge cases are handled. For example, a database on the genealogy of millions of people will inevitably encounter single parents and orphans, and procedures to handle those situations must be in place.
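
As a rough illustration of the proof-of-concept idea, the sketch below builds a tiny genealogy table in an in-memory SQLite database and checks that single-parent and orphan records can be stored. The `person` table, its column names, and the sample rows are hypothetical and were made up for this example; they are not taken from the article or the course database.

```python
import sqlite3

# In-memory database so the sketch leaves nothing behind.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")

# Hypothetical schema: both parent links are nullable, so a record with
# one known parent, or none, is still valid.
conn.execute("""
    CREATE TABLE person (
        person_id  INTEGER PRIMARY KEY,
        full_name  TEXT NOT NULL,
        parent1_id INTEGER REFERENCES person(person_id),  -- NULL allowed
        parent2_id INTEGER REFERENCES person(person_id)   -- NULL allowed
    )
""")

# Made-up sample rows covering the edge cases named above.
people = [
    (1, "Parent A", None, None),
    (2, "Parent B", None, None),
    (3, "Child with two recorded parents", 1, 2),
    (4, "Child with a single recorded parent", 1, None),
    (5, "Orphan with no recorded parents", None, None),
]
conn.executemany("INSERT INTO person VALUES (?, ?, ?, ?)", people)

# Proof-of-concept check: count the known parents for each person.
# In SQLite, "x IS NOT NULL" evaluates to 1 or 0, so the two can be added.
rows = conn.execute("""
    SELECT person_id,
           (parent1_id IS NOT NULL) + (parent2_id IS NOT NULL) AS known_parents
    FROM person
    ORDER BY person_id
""").fetchall()

for person_id, known_parents in rows:
    print(person_id, known_parents)
```

Had the parent columns been declared `NOT NULL`, the inserts for rows 1, 2, 4, and 5 would have been rejected, which is the kind of rigidity a small test like this is meant to surface before the system is rolled out.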
119 changes: 93 additions & 26 deletions 02_activities/assignments/DC_Cohort/assignment1.sql
Expand Up @@ -6,19 +6,19 @@
--SELECT
/* 1. Write a query that returns everything in the customer table. */
--QUERY 1



SELECT *
FROM customer;

--END QUERY


/* 2. Write a query that displays all of the columns and 10 rows from the customer table,
sorted by customer_last_name, then customer_first_ name. */
--QUERY 2



SELECT *
FROM customer
ORDER BY customer_last_name, customer_first_name
LIMIT 10;

--END QUERY

Expand All @@ -27,24 +27,25 @@ sorted by customer_last_name, then customer_first_ name. */
/* 1. Write a query that returns all customer purchases of product IDs 4 and 9.
Limit to 25 rows of output. */
--QUERY 3



SELECT *
FROM customer_purchases
WHERE product_id IN (4, 9) -- Note: product_id 9 appears to return no rows
LIMIT 25;

--END QUERY



/*2. Write a query that returns all customer purchases and a new calculated column 'price' (quantity * cost_to_customer_per_qty),
filtered by customer IDs between 8 and 10 (inclusive) using either:
1. two conditions using AND
2. one condition using BETWEEN
Limit to 25 rows of output.
*/
--QUERY 4



SELECT *
,(quantity*cost_to_customer_per_qty) AS price
FROM customer_purchases
WHERE customer_id BETWEEN 8 AND 10 -- BETWEEN is inclusive of both endpoints
LIMIT 25;

--END QUERY

Expand All @@ -55,9 +56,13 @@ Using the product table, write a query that outputs the product_id and product_n
columns and add a column called prod_qty_type_condensed that displays the word “unit”
if the product_qty_type is “unit,” and otherwise displays the word “bulk.” */
--QUERY 5
SELECT * FROM product; -- look at the product table



SELECT product_id,product_name
,CASE WHEN product_qty_type = 'unit' THEN 'unit'
ELSE 'bulk'
END AS prod_qty_type_condensed
FROM product;

--END QUERY

Expand All @@ -66,8 +71,14 @@ if the product_qty_type is “unit,” and otherwise displays the word “bulk.
add a column to the previous query called pepper_flag that outputs a 1 if the product_name
contains the word “pepper” (regardless of capitalization), and otherwise outputs 0. */
--QUERY 6


SELECT product_id,product_name
,CASE WHEN product_qty_type = 'unit' THEN 'unit'
ELSE 'bulk'
END AS prod_qty_type_condensed
,CASE WHEN LOWER(product_name) LIKE '%pepper%' THEN 1
ELSE 0
END AS pepper_flag
FROM product;


--END QUERY
Expand All @@ -78,9 +89,16 @@ contains the word “pepper” (regardless of capitalization), and otherwise out
vendor_id field they both have in common, and sorts the result by market_date, then vendor_name.
Limit to 24 rows of output. */
--QUERY 7
SELECT * FROM vendor; -- look at the vendor table
SELECT * FROM vendor_booth_assignments; -- look at the table



SELECT * -- returns all columns from both tables, including vendor_id from each
FROM vendor
INNER JOIN vendor_booth_assignments
ON vendor.vendor_id = vendor_booth_assignments.vendor_id
ORDER BY vendor_booth_assignments.market_date,
vendor.vendor_name
LIMIT 24;

--END QUERY

Expand All @@ -92,8 +110,18 @@ Limit to 24 rows of output. */
/* 1. Write a query that determines how many times each vendor has rented a booth
at the farmer’s market by counting the vendor booth assignments per vendor_id. */
--QUERY 8


-- Count booth assignments per vendor; vendor_name is joined in for readability.
SELECT
v.vendor_id
,v.vendor_name -- pulled from vendor to display alongside vendor_id and booth_rental_count
,COUNT(vba.vendor_id) AS booth_rental_count -- number of booth assignments for this vendor
FROM vendor v
JOIN vendor_booth_assignments vba
ON v.vendor_id = vba.vendor_id
GROUP BY
v.vendor_id
,v.vendor_name; -- no HAVING needed: the inner join already excludes vendors with zero assignments


--END QUERY
Expand All @@ -106,8 +134,22 @@ of customers for them to give stickers to, sorted by last name, then first name.
HINT: This query requires you to join two tables, use an aggregate function, and use the HAVING keyword. */
--QUERY 9



SELECT
c.customer_id
,c.customer_first_name
,c.customer_last_name
,SUM(p.quantity * p.cost_to_customer_per_qty) AS total_spent
FROM customer c
JOIN customer_purchases p
ON c.customer_id = p.customer_id
GROUP BY
c.customer_id
,c.customer_first_name
,c.customer_last_name
HAVING SUM(p.quantity * p.cost_to_customer_per_qty) > 2000
ORDER BY
c.customer_last_name
,c.customer_first_name;

--END QUERY

Expand All @@ -124,8 +166,21 @@ When inserting the new vendor, you need to appropriately align the columns to be
VALUES(col1,col2,col3,col4,col5)
*/
--QUERY 10
CREATE TABLE temp.new_vendor AS
SELECT *
FROM vendor;

SELECT *
FROM vendor; -- semicolon added so this statement is separated from the next SELECT

SELECT *
FROM temp.new_vendor;

INSERT INTO temp.new_vendor
VALUES (10, 'Thomas Superfood Store', 'Fresh Focused', 'Thomas', 'Rosenthal');

SELECT *
FROM temp.new_vendor;


--END QUERY
Expand All @@ -138,8 +193,15 @@ HINT: you might need to search for strfrtime modifers sqlite on the web to know
and year are!
Limit to 25 rows of output. */
--QUERY 11
SELECT *
FROM customer_purchases;


SELECT
customer_id
,strftime('%m', market_date) AS month
,strftime('%Y', market_date) AS year
FROM customer_purchases
LIMIT 25;


--END QUERY
Expand All @@ -152,8 +214,13 @@ HINTS: you will need to AGGREGATE, GROUP BY, and filter...
but remember, STRFTIME returns a STRING for your WHERE statement...
AND be sure you remove the LIMIT from the previous query before aggregating!! */
--QUERY 12


SELECT
customer_id
,SUM(quantity * cost_to_customer_per_qty) AS total_spent
FROM customer_purchases
WHERE strftime('%m', market_date) = '04'
AND strftime('%Y', market_date) = '2022'
GROUP BY customer_id;


--END QUERY