A reporting tool built using Python & SQL that summarizes data from a large database.

rishi-ramawat/FSND_P3-Logs_Analysis

Logs Analysis

A reporting tool that prints out reports (in plain text) based on the data in the database.

Note: This is a solution to project 3 of the Udacity Full-Stack Web Developer Nanodegree Program. The project is to build a reporting tool that runs complex SQL queries on a large database (more than 1.677 million records) and helps draw business conclusions from the data.

Installing Development Pre-Requisites

Installing The Project for Development / Testing on Linux

  • Clone the repository:

    $ git clone git@github.com:rishi-ramawat/FSND_P3-Logs_Analysis.git
    $ cd FSND_P3-Logs_Analysis
  • Initialize the project:

    $ bin/init_project
    • This shell script will create an .env file for you.
    • It will also install/upgrade python-dotenv & psycopg2.
    • Note: You might have to run this command with sudo, because it tries to upgrade Python packages.
  • Review .env and configure any required variables.

    • Make sure you have correct database credentials in the .env file before trying to run the project.

Installing The Project for Development / Testing on Windows

  • Clone the repo
  • Visit the folder where you have cloned the repo
    • Make a copy of .env.example and name it .env
    • Make sure all the required variables are present & initialized in .env
  • Make sure you have python-dotenv & psycopg2 installed
    • You can run pip3 install -U -r requirements.txt to install/upgrade them automatically.
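The installation steps above rely on a .env file for database credentials. As a minimal sketch of how those settings might be turned into a connection string, assuming hypothetical variable names (DB_NAME, DB_USER, DB_PASSWORD, DB_HOST, DB_PORT) that may differ from the keys in .env.example:

```python
import os

try:
    # python-dotenv is installed by the setup steps; it is optional here.
    from dotenv import load_dotenv
except ImportError:
    load_dotenv = None


def build_dsn(env=None):
    """Build a libpq-style connection string from environment variables.

    The variable names below are assumptions for illustration; use the
    keys that .env.example actually defines.
    """
    env = os.environ if env is None else env
    parts = {
        "dbname": env.get("DB_NAME", "news"),
        "user": env.get("DB_USER"),
        "password": env.get("DB_PASSWORD"),
        "host": env.get("DB_HOST"),
        "port": env.get("DB_PORT"),
    }
    # Skip unset values so the driver falls back to its own defaults.
    # Simplification: values containing spaces would need quoting.
    return " ".join(f"{key}={value}" for key, value in parts.items() if value)


def load_settings():
    """Read .env (when python-dotenv is available) and return the DSN."""
    if load_dotenv is not None:
        load_dotenv()  # copies entries from .env into os.environ
    return build_dsn()
```

The resulting string can be passed to psycopg2.connect(), which accepts a connection string as its first argument.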

Setting Up The Database

  • In PostgreSQL create the news database.
    • If PostgreSQL is installed natively on your system, you can create the database with the createdb news command.
  • Next, download the newsdata.zip here.
    • You will need to unzip this file after downloading it.
    • The file inside is called newsdata.sql.
  • To load the data, use the command psql -d news -f newsdata.sql.
  • Next, run the following command to create DATABASE VIEWS required to run the app:
    • psql -d news -f sql/create_views.sql
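For readers unfamiliar with database views, here is a small illustration of the idea. It is not the content of sql/create_views.sql: the view name path_hits and the log columns are hypothetical, and an in-memory SQLite database stands in for PostgreSQL so the sketch runs without a server.

```python
import sqlite3

# SQLite stand-in for the PostgreSQL news database (assumption for this sketch).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE log (path TEXT, status TEXT);
    INSERT INTO log VALUES
        ('/article/a', '200 OK'),
        ('/article/a', '200 OK'),
        ('/article/b', '404 NOT FOUND');

    -- Hypothetical view: successful hits per path.
    CREATE VIEW path_hits AS
        SELECT path, COUNT(*) AS hits
        FROM log
        WHERE status = '200 OK'
        GROUP BY path;
""")

# A view is queried like a table; the aggregation runs when it is read.
view_rows = conn.execute(
    "SELECT path, hits FROM path_hits ORDER BY hits DESC"
).fetchall()
print(view_rows)
```

Defining a view once lets the reporting queries stay short, because the aggregation lives in the database rather than being repeated in each query.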

Running the project

Run the following command to generate the reports:

$ python3 app/logs_analysis.py

Project Scenario

You've been hired onto a team working on a newspaper site. The user-facing newspaper site frontend itself, and the database behind it, are already built and running. You've been asked to build an internal reporting tool that will use information from the database to discover what kind of articles the site's readers like.

The database contains newspaper articles, as well as the web server log for the site. The log has a database row for each time a reader loaded a web page. Using that information, your code will answer questions about the site's user activity.

The program you write in this project will run from the command line. It won't take any input from the user. Instead, it will connect to that database, use SQL queries to analyze the log data, and print out the answers to some questions.

The database includes three tables:

  • The authors table includes information about the authors of articles.
  • The articles table includes the articles themselves.
  • The log table includes one entry for each time a user has accessed the site.
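To make the relationship between the three tables concrete, the sketch below builds a tiny stand-in database and answers one question in a single query with joins. The column names (id, name, author, title, slug, path, status), the '/article/' path prefix, and the sample rows are assumptions for illustration, and SQLite is used in place of PostgreSQL so the example runs without a server.

```python
import sqlite3


def build_sample_db():
    """Create an in-memory stand-in with assumed columns and sample rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE authors (id INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE articles (
            id INTEGER PRIMARY KEY,
            author INTEGER REFERENCES authors(id),
            title TEXT,
            slug TEXT
        );
        CREATE TABLE log (id INTEGER PRIMARY KEY, path TEXT, status TEXT);

        INSERT INTO authors (id, name) VALUES (1, 'Ada'), (2, 'Grace');
        INSERT INTO articles (author, title, slug) VALUES
            (1, 'Bears', 'bears'),
            (2, 'Goats', 'goats');
        INSERT INTO log (path, status) VALUES
            ('/article/bears', '200 OK'),
            ('/article/bears', '200 OK'),
            ('/article/goats', '200 OK'),
            ('/missing', '404 NOT FOUND');
    """)
    return conn


# One query joins all three tables instead of querying each separately.
POPULAR_ARTICLES = """
    SELECT articles.title, authors.name, COUNT(log.id) AS views
    FROM articles
    JOIN authors ON authors.id = articles.author
    JOIN log ON log.path = '/article/' || articles.slug
    WHERE log.status = '200 OK'
    GROUP BY articles.title, authors.name
    ORDER BY views DESC
"""

conn = build_sample_db()
popular = conn.execute(POPULAR_ARTICLES).fetchall()
print(popular)
```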

Project Highlights

  • This project implements a single-query solution for each of the questions at hand.
    • When the application fetches data from multiple tables, it uses a single query with a join, rather than multiple queries.
    • See app/logs_analysis.py for more details.
  • The code conforms to the PEP8 style recommendations.
  • The project outputs reports in Markdown format. See OUTPUT.md for more details.
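As a rough sketch of how query results could be rendered as a Markdown report, the helper below turns rows into a Markdown table. The function name to_markdown_table and the sample rows are hypothetical; the actual layout in OUTPUT.md may differ.

```python
def to_markdown_table(headers, rows):
    """Render column headers and row tuples as a Markdown table string."""
    lines = [
        "| " + " | ".join(headers) + " |",
        "| " + " | ".join("---" for _ in headers) + " |",
    ]
    for row in rows:
        # str() lets numeric cells such as view counts be joined as text.
        lines.append("| " + " | ".join(str(cell) for cell in row) + " |")
    return "\n".join(lines)


# Sample rows in the shape a database cursor's fetchall() would return.
report = to_markdown_table(["Article", "Views"], [("Bears", 2), ("Goats", 1)])
print(report)
```

Keeping the formatting in one small function separates presentation from the SQL, so the same rows could also be printed in another layout.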
