Kaiden Jessani - Shopify Application #198

Open
wants to merge 20 commits into
base: main
3 changes: 3 additions & 0 deletions README_FINAL
Original file line number Diff line number Diff line change
@@ -0,0 +1,3 @@
The "create_and_populate_database.py" program creates and populates a SQL database.
A results folder has been created. It has the expected results for each of the problems.
The "test_sql_queries.py" program executes the three SQL files, and compares the expected results against the actual results. The outcome is displayed in the console.
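The diff does not include "test_sql_queries.py" itself, so the following is only a minimal sketch of how such a comparison could work, assuming the expected results are CSV text with a header row (as in the results folder). The helper names `run_query_as_csv_rows` and `matches_expected` are hypothetical, and the in-memory table holds a single made-up row rather than the PR's database.

```python
import csv
import io
import sqlite3


def run_query_as_csv_rows(conn, query):
    """Run one SQL statement and return header + rows as lists of strings."""
    cursor = conn.execute(query)
    header = [col[0] for col in cursor.description]
    rows = [[str(value) for value in row] for row in cursor.fetchall()]
    return [header] + rows


def matches_expected(conn, query, expected_csv_text):
    """Compare the query output with expected CSV text, ignoring blank lines."""
    expected = [row for row in csv.reader(io.StringIO(expected_csv_text)) if row]
    return run_query_as_csv_rows(conn, query) == expected


# Hypothetical stand-in for the populated database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Users (user_id INTEGER PRIMARY KEY, username TEXT)")
conn.execute("INSERT INTO Users VALUES (5, 'sarahwilson')")

expected_text = "user_id,username\n5,sarahwilson\n"
result = matches_expected(conn, "SELECT user_id, username FROM Users", expected_text)
print("PASS" if result else "FAIL")
```

Comparing everything as strings keeps the sketch short; a real harness might also need to decide how NULL values and floating-point rounding should be written in the expected files.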
3 changes: 3 additions & 0 deletions results/problem1
@@ -0,0 +1,3 @@
product_id,product_name,description,price,category_id
15,Mountain Bike,Conquer the trails with this high-performance mountain bike.,1000.0,8
16,Tennis Racket,Take your tennis game to the next level with this professional-grade racket.,54.0,8
2 changes: 2 additions & 0 deletions results/problem10
@@ -0,0 +1,2 @@
user_id,username
5,sarahwilson
9 changes: 9 additions & 0 deletions results/problem11
@@ -0,0 +1,9 @@
product_id,product_name,category_id,MAX(p.price)
1,Smartphone X,1,500.0
3,Laptop Pro,2,1200.0
6,Designer Dress,3,300.0
7,Coffee Maker,4,80.0
9,Action Camera,5,200.0
12,Skincare Set,6,150.0
14,Weighted Blanket,7,100.0
15,Mountain Bike,8,1000.0
1 change: 1 addition & 0 deletions results/problem12
@@ -0,0 +1 @@
user_id,username
31 changes: 31 additions & 0 deletions results/problem2
@@ -0,0 +1,31 @@
user_id,username,count(o.order_id)
1,johndoe,1
2,janesmith,1
3,maryjones,1
4,robertbrown,1
5,sarahwilson,1
6,michaellee,1
7,lisawilliams,1
8,chrisharris,1
9,emilythompson,1
10,davidmartinez,1
11,amandajohnson,1
12,jasonrodriguez,1
13,ashleytaylor,1
14,matthewthomas,1
15,sophiawalker,1
16,jacobanderson,1
17,olivialopez,1
18,ethanmiller,1
19,emilygonzalez,1
20,williamhernandez,1
21,sophiawright,1
22,alexanderhill,1
23,madisonmoore,1
24,jamesrogers,1
25,emilyward,1
26,benjamincarter,1
27,gracestewart,1
28,danielturner,1
29,elliecollins,1
30,williamwood,1
17 changes: 17 additions & 0 deletions results/problem3
@@ -0,0 +1,17 @@
product_id,product_name,AVG(r.rating)
1,Smartphone X,5.0
2,Wireless Headphones,4.0
3,Laptop Pro,3.0
4,Smart TV,5.0
5,Running Shoes,2.0
6,Designer Dress,4.0
7,Coffee Maker,5.0
8,Toaster Oven,3.0
9,Action Camera,4.0
10,Board Game Collection,1.0
11,Yoga Mat,5.0
12,Skincare Set,4.0
13,Vitamin C Supplement,2.0
14,Weighted Blanket,3.0
15,Mountain Bike,5.0
16,Tennis Racket,4.0
6 changes: 6 additions & 0 deletions results/problem4
@@ -0,0 +1,6 @@
user_id,username,sum(o.total_amount)
12,jasonrodriguez,160.0
4,robertbrown,155.0
8,chrisharris,150.0
24,jamesrogers,150.0
17,olivialopez,145.0
6 changes: 6 additions & 0 deletions results/problem5
@@ -0,0 +1,6 @@
product_id,product_name,avg_rating
1,Smartphone X,5.0
4,Smart TV,5.0
7,Coffee Maker,5.0
11,Yoga Mat,5.0
15,Mountain Bike,5.0
1 change: 1 addition & 0 deletions results/problem6
@@ -0,0 +1 @@
user_id,username
1 change: 1 addition & 0 deletions results/problem7
@@ -0,0 +1 @@
product_id,product_name
1 change: 1 addition & 0 deletions results/problem8
@@ -0,0 +1 @@
user_id,username
4 changes: 4 additions & 0 deletions results/problem9
@@ -0,0 +1,4 @@
category_id,category_name,total_sales_amount
8,Sports & Outdoors,155.0
4,Home & Kitchen,145.0
1,Electronics,125.0
86 changes: 83 additions & 3 deletions sql/task1.sql
@@ -1,14 +1,94 @@
-- Problem 1: Retrieve all products in the Sports category
-- Problem 1:
-- Question: Retrieve all products in the Sports category
-- Write an SQL query to retrieve all products in a specific category.
--
-- Main Query:
-- Selects all columns (p.*) from the 'Products' table.
-- Uses a LEFT JOIN to attach the matching category from the 'Categories' table to each product, based on category_id.
-- LEFT JOIN:
-- Connects rows from 'Products' to 'Categories' based on the common column category_id.
-- Products without a matching category get NULL category columns, but the WHERE clause below then filters them out,
-- so this query returns the same rows an inner JOIN would.
-- WHERE clause:
-- Filters the result to include only rows where the lowercase category_name from the 'Categories' table contains the substring 'sports'.
-- The LIKE operator with % is used for a partial match, and LOWER() is used to perform a case-insensitive comparison.
--
-- Main query to select products from the 'Products' table based on a left join with 'Categories'
SELECT p.* FROM Products p -- Selects all columns from the 'Products' table
LEFT JOIN Categories c
ON p.category_id=c.category_id -- Left join with 'Categories' table
WHERE LOWER(c.category_name) LIKE '%sports%' -- Filters products where the category name contains 'sports' in a case-insensitive manner
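The Problem 1 query filters on a column from the right-hand side of a LEFT JOIN. The sketch below, run against an in-memory SQLite database with made-up rows (the uncategorised product is hypothetical and not part of the PR's data), suggests that this filter drops products with no category, so LEFT JOIN and JOIN return the same rows here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Categories (category_id INTEGER PRIMARY KEY, category_name TEXT);
CREATE TABLE Products (product_id INTEGER PRIMARY KEY, product_name TEXT, category_id INTEGER);
INSERT INTO Categories VALUES (8, 'Sports & Outdoors');
INSERT INTO Products VALUES (15, 'Mountain Bike', 8);
INSERT INTO Products VALUES (99, 'Uncategorised Item', NULL);  -- hypothetical row
""")

# Same query shape as Problem 1, with the join type left as a placeholder.
query = """
SELECT p.product_id, p.product_name
FROM Products p
{join} Categories c ON p.category_id = c.category_id
WHERE LOWER(c.category_name) LIKE '%sports%'
"""

left_rows = conn.execute(query.format(join="LEFT JOIN")).fetchall()
inner_rows = conn.execute(query.format(join="JOIN")).fetchall()

# LOWER(NULL) LIKE '%sports%' evaluates to NULL, so the uncategorised row is
# discarded by WHERE in both variants.
print(left_rows)
print(inner_rows)
```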

-- Problem 2: Retrieve the total number of orders for each user
-- *************************************************************************************************************************************************************
-- Problem 2:
-- Question: Retrieve the total number of orders for each user
-- Write an SQL query to retrieve the total number of orders for each user.
-- The result should include the user ID, username, and the total number of orders.
-- Main Query:
-- Selects the user_id and username from the 'Users' table.
-- Counts the number of orders for each user using the COUNT() function on the order_id.
-- JOIN clause:
-- Connects rows from 'Users' to 'Orders' based on the common column user_id.
-- This is an inner join, so users with no orders do not appear in the result.
-- GROUP BY clause:
-- Groups the results by user_id, so the count of orders is calculated for each user separately.
--
-- Main query to count the number of orders for each user
SELECT u.user_id, u.username, count(o.order_id) FROM Users u
JOIN Orders o
ON u.user_id=o.user_id -- Joining Users and Orders tables based on user_id
GROUP BY u.user_id -- Grouping the results by user_id
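Problem 2 asks for a count "for each user", and the query uses an inner JOIN. The following sketch, on an in-memory SQLite database with two made-up users (one of whom has no orders, which is an assumption and does not occur in the PR's expected results), contrasts that with a LEFT JOIN variant that reports a count of 0.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Users (user_id INTEGER PRIMARY KEY, username TEXT);
CREATE TABLE Orders (order_id INTEGER PRIMARY KEY, user_id INTEGER);
INSERT INTO Users VALUES (1, 'johndoe');
INSERT INTO Users VALUES (2, 'janesmith');
INSERT INTO Orders VALUES (1, 1);  -- only johndoe has an order in this sketch
""")

query = """
SELECT u.user_id, u.username, COUNT(o.order_id)
FROM Users u
{join} Orders o ON u.user_id = o.user_id
GROUP BY u.user_id, u.username
ORDER BY u.user_id
"""

inner_counts = conn.execute(query.format(join="JOIN")).fetchall()
left_counts = conn.execute(query.format(join="LEFT JOIN")).fetchall()

print(inner_counts)  # users without orders are missing
print(left_counts)   # users without orders appear with a count of 0
```

COUNT(o.order_id) ignores NULLs, which is why the LEFT JOIN variant yields 0 rather than 1 for a user with no orders.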

-- Problem 3: Retrieve the average rating for each product
-- *************************************************************************************************************************************************************
-- Problem 3:
-- Question: Retrieve the average rating for each product
-- Write an SQL query to retrieve the average rating for each product.
-- The result should include the product ID, product name, and the average rating.
-- Main Query:
-- Selects the product_id from the 'Reviews' table and the product_name from the 'Products' table.
-- Calculates the average rating for each product using the AVG() function on the rating column from the 'Reviews' table.
-- JOIN clause:
-- Connects rows from 'Products' to 'Reviews' based on the common column product_id.
-- This is an inner join, so products with no reviews do not appear in the result.
-- GROUP BY clause:
-- Groups the results by product_id, so the average rating is calculated for each product separately.
--
-- Main query to calculate the average rating for each product based on reviews
SELECT r.product_id, p.product_name, AVG(r.rating) FROM Products p
JOIN Reviews r
ON p.product_id=r.product_id -- Joining Products and Reviews tables based on product_id
GROUP BY r.product_id -- Grouping the results by product_id

-- *************************************************************************************************************************************************************
-- Problem 4: Retrieve the top 5 users with the highest total amount spent on orders
-- Write an SQL query to retrieve the top 5 users with the highest total amount spent on orders.
-- The result should include the user ID, username, and the total amount spent.
--
-- Main Query:
-- Selects the user_id and username from the 'Users' table.
-- Calculates the total amount spent by each user using the SUM() function on the total_amount column from the 'Orders' table.
-- JOIN clause:
-- Connects rows from 'Users' to 'Orders' based on the common column user_id.
-- GROUP BY clause:
-- Groups the results by user_id and username, so the total amount is calculated for each user separately.
-- ORDER BY clause:
-- Orders the results in descending order of total amount (3 is the position of the SUM(o.total_amount) expression in the SELECT list).
-- LIMIT clause:
-- Limits the result set to the top 5 users by total amount spent.
--
-- Main query to calculate the total amount spent by each user and limit the result to the top 5 users by total amount
SELECT u.user_id, u.username, sum(o.total_amount) FROM Users u
JOIN Orders o
ON u.user_id=o.user_id -- Joining Users and Orders tables based on user_id
GROUP BY u.user_id, u.username -- Grouping the results by user_id and username
ORDER BY 3 DESC -- Ordering the results in descending order based on total amount
LIMIT 5 -- Limit the result to the top 5 users by total amount











77 changes: 73 additions & 4 deletions sql/task2.sql
@@ -1,19 +1,88 @@
-- Problem 5: Retrieve the products with the highest average rating
-- Problem 5:
-- Question: Retrieve the products with the highest average rating
-- Write an SQL query to retrieve the products with the highest average rating.
-- The result should include the product ID, product name, and the average rating.
-- Hint: You may need to use subqueries or common table expressions (CTEs) to solve this problem.
--
-- The avg_rating CTE calculates the average rating for each product using the AVG() function.
-- GROUP BY p.product_id, p.product_name returns one row per product, even when a product has several reviews.
WITH
avg_rating AS (
-- Results to display product ID, product name, and the average rating
SELECT p.product_id, p.product_name, AVG(r.rating) AS avg_rating FROM Products p
JOIN Reviews r
ON p.product_id=r.product_id
GROUP BY p.product_id, p.product_name
)
-- Selects all columns from the avg_rating CTE.
SELECT * FROM avg_rating
WHERE avg_rating = (SELECT MAX(avg_rating) FROM avg_rating) -- Include only rows where the avg_rating is equal to the maximum average rating
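Problem 5 needs one row per product. The sketch below contrasts AVG() as a window function with AVG() under GROUP BY on an in-memory SQLite database. The second review is a hypothetical row added for illustration (the PR's expected results are consistent with one review per product), and window functions assume SQLite 3.25 or later.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Products (product_id INTEGER PRIMARY KEY, product_name TEXT);
CREATE TABLE Reviews (review_id INTEGER PRIMARY KEY, product_id INTEGER, rating REAL);
INSERT INTO Products VALUES (1, 'Smartphone X');
INSERT INTO Reviews VALUES (1, 1, 5.0);
INSERT INTO Reviews VALUES (2, 1, 5.0);  -- hypothetical second review
""")

# Window function: the average is attached to every joined row.
window_rows = conn.execute("""
SELECT p.product_id, p.product_name,
       AVG(r.rating) OVER (PARTITION BY p.product_id) AS avg_rating
FROM Products p
JOIN Reviews r ON p.product_id = r.product_id
""").fetchall()

# GROUP BY: the joined rows collapse to one row per product.
grouped_rows = conn.execute("""
SELECT p.product_id, p.product_name, AVG(r.rating) AS avg_rating
FROM Products p
JOIN Reviews r ON p.product_id = r.product_id
GROUP BY p.product_id, p.product_name
""").fetchall()

print(len(window_rows), len(grouped_rows))
```

With a single review per product both forms give the same output, so the difference only shows up once a product is reviewed more than once.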

-- *************************************************************************************************************************************************************
-- Problem 6: Retrieve the users who have made at least one order in each category
-- Write an SQL query to retrieve the users who have made at least one order in each category.
-- The result should include the user ID and username.
-- Hint: You may need to use subqueries or joins to solve this problem.
-- The DistinctCategories CTE calculates the count of distinct categories each user has ordered from.
WITH
DistinctCategories AS (
SELECT COUNT(DISTINCT c.category_id) dist_cat_n, u.user_id, u.username FROM Users u
JOIN Orders o -- Joins the Users, Orders, Order_Items, Products, and Categories tables
ON u.user_id=o.user_id
JOIN Order_Items oi
ON o.order_id=oi.order_id
JOIN Products p
ON p.product_id=oi.product_id
JOIN Categories c
ON p.category_id=c.category_id
GROUP BY 2,3 -- group the results by user_id and username
)
SELECT user_id, username FROM DistinctCategories
WHERE dist_cat_n = (SELECT COUNT(DISTINCT category_id) FROM Categories) -- Filters the results to include only users who have ordered from every category

-- Problem 7: Retrieve the products that have not received any reviews
-- *************************************************************************************************************************************************************
-- Problem 7:
-- Question: Retrieve the products that have not received any reviews
-- Write an SQL query to retrieve the products that have not received any reviews.
-- The result should include the product ID and product name.
-- Hint: You may need to use subqueries or left joins to solve this problem.
--
-- The Main Query Selects product_id and product_name from the Products table.
-- The WHERE clause filters the results to include only products whose product_id is not present in the subquery.
--
-- Main query to select products that have no reviews
SELECT product_id, product_name FROM Products
WHERE product_id NOT IN (SELECT DISTINCT product_id FROM Reviews) -- Subquery to get distinct product IDs from the Reviews table
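One caveat with the NOT IN form of Problem 7: if the subquery ever returns a NULL, NOT IN matches nothing. The sketch below shows this on an in-memory SQLite database. The review row with a NULL product_id is hypothetical, and the PR's schema may well forbid it, in which case the original query is unaffected. NOT EXISTS is shown as one possible alternative, not as the author's method.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Products (product_id INTEGER PRIMARY KEY, product_name TEXT);
CREATE TABLE Reviews (review_id INTEGER PRIMARY KEY, product_id INTEGER);
INSERT INTO Products VALUES (1, 'Smartphone X');
INSERT INTO Products VALUES (2, 'Wireless Headphones');
INSERT INTO Reviews VALUES (1, 1);
INSERT INTO Reviews VALUES (2, NULL);  -- hypothetical orphaned review
""")

# NOT IN against a list containing NULL never evaluates to true.
not_in_rows = conn.execute("""
SELECT product_id FROM Products
WHERE product_id NOT IN (SELECT product_id FROM Reviews)
""").fetchall()

# NOT EXISTS only checks whether a matching review row exists.
not_exists_rows = conn.execute("""
SELECT p.product_id FROM Products p
WHERE NOT EXISTS (SELECT 1 FROM Reviews r WHERE r.product_id = p.product_id)
""").fetchall()

print(not_in_rows)
print(not_exists_rows)
```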

-- Problem 8: Retrieve the users who have made consecutive orders on consecutive days
-- *************************************************************************************************************************************************************
-- Problem 8:
-- Question: Retrieve the users who have made consecutive orders on consecutive days
-- Write an SQL query to retrieve the users who have made consecutive orders on consecutive days.
-- The result should include the user ID and username.
-- Hint: You may need to use subqueries or window functions to solve this problem.
--
-- CTE orders_with_prev:
-- Calculates the previous order date for each order of each user.
-- The LAG window function is used to obtain the previous order date based on the order date for each user.
-- Main Query:
-- Selects user_id and username from the orders_with_prev CTE.
-- The WHERE clause keeps only rows whose order date is exactly one day after the same user's previous order date.
-- JulianDay() converts each date to a day number, and CAST converts the difference to an integer.
--
-- CTE named 'orders_with_prev' to calculate the previous order date for each order
WITH orders_with_prev AS (
SELECT
o.user_id,
u.username,
o.order_date,
LAG(o.order_date,1) OVER (PARTITION BY o.user_id ORDER BY o.order_date) previous_order_date
FROM Orders o
JOIN Users u
ON o.user_id=u.user_id
ORDER BY 1,3 -- Order by user_id and order_date
)
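-- Main query to select user_id and username from the 'orders_with_prev' CTE
-- DISTINCT returns each user once, even if they have several pairs of consecutive-day orders
SELECT DISTINCT
user_id,
username
FROM orders_with_prev
WHERE CAST((JULIANDAY(order_date) - JULIANDAY(previous_order_date)) AS INTEGER) = 1 -- Keep orders placed one day after the user's previous order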
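The LAG approach from Problem 8 can be tried on a small in-memory SQLite database. This is a sketch with made-up order dates (assumed to be stored as 'YYYY-MM-DD' text, which is not confirmed by the diff) and it needs SQLite 3.25 or later for window functions. It also shows that, without DISTINCT, a user with several consecutive-day pairs is returned once per pair.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Orders (order_id INTEGER PRIMARY KEY, user_id INTEGER, order_date TEXT);
INSERT INTO Orders VALUES (1, 1, '2024-01-01');
INSERT INTO Orders VALUES (2, 1, '2024-01-02');
INSERT INTO Orders VALUES (3, 1, '2024-01-03');
INSERT INTO Orders VALUES (4, 2, '2024-01-01');
INSERT INTO Orders VALUES (5, 2, '2024-01-05');
""")

# LAG pairs each order with the same user's previous order date; the first
# order of each user has a NULL previous date and is dropped by WHERE.
query = """
WITH orders_with_prev AS (
    SELECT user_id,
           order_date,
           LAG(order_date, 1) OVER (PARTITION BY user_id ORDER BY order_date)
               AS previous_order_date
    FROM Orders
)
SELECT {distinct} user_id
FROM orders_with_prev
WHERE CAST(JULIANDAY(order_date) - JULIANDAY(previous_order_date) AS INTEGER) = 1
"""

without_distinct = conn.execute(query.format(distinct="")).fetchall()
with_distinct = conn.execute(query.format(distinct="DISTINCT")).fetchall()

print(without_distinct)  # user 1 appears once per consecutive-day pair
print(with_distinct)     # user 1 appears once
```

User 2 is excluded in both cases because the two orders are four days apart.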