Data Analysis With SQL#
In this learning guide, you’ll be using MySQL and MySQL Workbench to explore, manipulate, and analyze data.
Along the way, you will learn how to:
✅ Install and navigate MySQL Workbench to manage databases efficiently
✅ Import datasets and create tables to organize your data
✅ Write SQL queries to extract, filter, and summarize data
✅ Save queries and export tables or results for further use
✅ Apply basic data analysis techniques to uncover trends and insights
✅ Document work in a clear and structured format for sharing and reporting
This guide includes exercises using both a sample dataset you create and an external dataset you will import to practice your SQL skills.
If you have any questions while going through this guide, feel free to email the author, Pranav Rajeev
Section 1: SQL Introduction#
Structured Query Language (SQL) is the standard language used to manage and analyze data stored in relational databases. It lets you do things like:
Retrieve data
Manipulate data
Analyze data
Prepare data for dashboards, reports or further analysis
A Relational Database Management System (RDBMS) is software that stores, organizes and retrieves data in a relational form. SQL is the language you use to communicate with and manage RDBMS’s. The five most common are MySQL, PostgreSQL, Microsoft SQL Server, Oracle Database and SQLite. The main differences between these come down to features, syntax/functions and cost.
For this handbook, we’ll look at MySQL because:
It’s beginner friendly but still powerful
It’s free and easy to install
Most concepts learned in MySQL can be applied to others with minor syntax changes
Section 2: Getting Started#
Downloading MySQL#
Windows:
Step 1: Go to https://dev.mysql.com/downloads/installer/
Select version and OS and download the larger file
Click ‘no thanks, just start my download’
Step 2: Run the installer and follow the setup process
For setup type, unless you have specific needs or know exactly which one you want, just select ‘Full’ setup type
Continue following instructions, clicking next/execute
Setup password when prompted and continue on
On the installation complete page, check ‘Start MySQL Workbench after setup.’ Starting MySQL Shell is not required right now unless you want to explore it.
Finish installation
IOS:
Step 1: Go to https://dev.mysql.com/downloads/mysql/
Select version (8.0.43) and OS and download the DMG Archive
Click ‘no thanks, just start my download’
Launch the downloaded package
Step 2: Follow download process
Continue/agree with prompts
Enter password when prompted
Step 3: Go to https://dev.mysql.com/downloads/workbench/
Select version and OS and download the DMG Archive
Click ‘no thanks, just start my download’
Once finished downloading, open it and drag the MySQL Workbench Icon into the Applications folder
Now, MySQL Workbench should be installed onto your macbook. Open MySQL Workbench (from app launcher or spotlight)
Section 3: Basic Data Analysis#
Databases#
You can think of a database as a folder. It acts as a container. Tables would be the files within the database (folder).
Creating a Database:
Step 1: In your query editor, write the query (replace *_* with your db name)
CREATE DATABASE *your_database_name*; |
---|
When you make MySQL queries, there is no case sensitivity so you don’t need to worry about capitalization of anything. A good format to follow, however, is using uppercase for keywords like CREATE DATABASE or SELECT (clause that you will learn soon)
Step 2: Click on the lightning bolt button (#5 in the legend from earlier) to execute the statement
You will see in your output (7) that the action was successful (green check)
Step 3: Select the Schemas tab (4) and hit the refresh button beside where it says schemas. You should now see your new database.
Use the database:
There are two ways:
Method 1: Right click on the database and click on ‘Set as Default Schema’
Method 2: In your query editor, write the query (replace *_* with your db name)
USE *your_database_name*; |
---|
Drop the database:
Step 1: In your query editor, write the query (replace *_* with your db name)
DROP DATABASE *your_database_name*; |
---|
Step 2: Click on the lightning bolt button (#5 in the legend from earlier) to execute the statement
You will see in your output (7) that the action was successful (green check)
Step 3: Select the Schemas tab (4) and hit the refresh button beside where it says schemas. You should not see your database now.
Saving Queries#
To save your query, click on the save icon in the tool bar (near lightning bolt) ->
Alternatively, you could click on file → save script as
To create a new query tab, click on the new file icon (1) on the top left
Important Notes About Running Queries
End each SQL statement with a semicolon (;). This tells SQL that the current statement is complete.
When you run a script, MySQL Workbench only executes the statement your cursor is currently inside (or the text you have highlighted).
If your cursor is on the second line, only that statement will run, not the entire file.
To run all queries in the file, highlight them all first, then click the Execute button (⚡).
For this project, you can do all the learning queries in 1 tab if you’d like, however, do the exercises in separate tabs.
Exercise #1#
Make a new tab. Create a database and use it. Save this script into a folder for the work you’ll do in this handbook.
Creating Tables#
Step 1: In your query editor, write the query (replace *_* with your table and column names where required)
CREATE TABLE *your_table_name* ( *column_1_name* *column_type*, *column_2_name* *column_type*, … *column_i_name* *column_type*, ); |
---|
Here’s an example:
CREATE TABLE personal_info ( personal_id INT, first_name VARCHAR(50), last_name VARCHAR(50), age INT, gender VARCHAR(10), birth_date DATE ); |
---|
Step 2: Run the code and hit the refresh button in the schemas tab. You should see your table in your database now.
Exercise #2#
Make a new tab. Create a table for a list of employees. Include at least 5 columns with at least 3 different data types.
*For a list of data types (Int, Varchar, etc.) go to the syntax tab*
Inserting Rows Into Tables#
Step 1:
In your query editor, write the query (replace *_* with your table and column names where required)
INSERT INTO *your_table_name* VALUES (*column_1_entry, column_2_entry, …, column_i_entry); |
---|
Here’s an example for the table we created earlier:
INSERT INTO personal_info VALUES (1, ‘Michael’, ‘Scott’, 40, ‘Male’, ‘1965-03-15’), (2, ‘Walter’, ‘White’, 50, ‘Male’, ‘1958-09-07’), (3, ‘Elizabeth’, ‘Keen’, 30, ‘Female’, ‘1985-12-20’); |
---|
To view your table, in your query editor, write the query
SELECT * FROM *your_table_name* |
---|
For the table I created earlier:
SELECT * FROM personal_info; |
---|
Exercise #3#
In the tab from exercise 2. insert at least 5 rows to your table of employees. Use the SELECT statement to view your table.
Writing and Running Basic Queries#
Select Statement#
To view your table, in your query editor, write the query
SELECT * FROM *your_table_name* |
---|
To view only specific columns, in your query editor, write the query
SELECT column_name_1, column_name_2, …, column_name_i FROM *your_table_name* |
---|
For the example earlier, we can show the table with only the columns name and age
SELECT first_name, age FROM personal_info; |
---|
Exercise #4#
In the tab you used for exercise 2, try showing only three of the columns that are in your dataset.
Filtering (Where Statement)
To extract only the records that fulfill a specified condition, we use the WHERE statement with your SELECT statement
SELECT * FROM *your_table_name* WHERE condition |
---|
For the example earlier, we can show the table with only the columns name and age
SELECT first_name, last_name, age FROM personal_info WHERE gender = ‘female’ |
---|
There are lots of formats for writing conditions. You can use Basic comparison operators (=, != (or <>), <, >, <=, >=), Multiple conditions with AND/OR: (WHERE gender = ‘Male’ AND age > 45;) and even do things like pattern matching with the LIKE operator. If you want to find more operators like this, check out the ‘Common Syntax’ tab for clauses like ‘In’, ‘Between’, ‘Like’ and ‘Regex’.
Exercise #5#
In the same tab as exercise 2, create a filter using at least 1 AND/OR.
Sorting Data (Order By Statement)
To sort the rows returned by your query, use the ORDER BY clause. You can sort results in ascending order (default) or descending order by adding DESC.
SELECT * FROM *your_table_name* ORDER BY column_name (DESC if needed); |
---|
For the example earlier, we can order the table by age (ascending):
SELECT first_name, last_name, age FROM personal_info ORDER BY age; |
---|
Or we can order the table by age (descending):
SELECT first_name, last_name, age FROM personal_info ORDER BY age DESC; |
---|
Exercise #6#
In the same tab as exercise 2, order your dataset by a column in descending order.
Limiting Results (Limit Statement)
To return only a certain number of rows, use the LIMIT clause. This is helpful when working with large datasets.
SELECT * FROM *your_table_name* LIMIT number_of_rows; |
---|
For the example earlier, we can limit to only 2 rows:
SELECT * FROM personal_info LIMIT 2; |
---|
Exercise #7#
In the same tab as exercise 2, use the select function while limiting results to only 3 rows.
You can also stack all these clauses.
SELECT * FROM *your_table_name* WHERE condition ORDER BY column_name LIMIT number_of_rows; |
---|
Exercise #8#
In the same tab as exercise 2, show your dataset while using all of the SELECT, WHERE, ORDER BY and LIMIT clauses. Now save this script with exercises 2-7 into your folder.
Importing Sample Data And Table#
Download Data From Kaggle#
Let’s start by grabbing our dataset from Kaggle, an awesome free platform full of real-world data.
If you’re new to Kaggle, click Register—you can sign up with your Google account or email. Already have an account? Just sign in.
For this project, we’ll be using the dataset: Data Analyst Job Roles in Canada
To download:
Click the Download button
Leave all settings as default as shown below:
Then click Download dataset as ZIP
To import the sample dataset, follow these steps:
Step 1: In the Schemas Panel, right-click on your schema (*your_database_name*)
Step 2: Select Table Data Import Wizard.
Step 3: Browse for your dataset file (Cleaned_Dataset.csv for ours) and click Next.
Step 4: Choose the target destination where the data will be imported. If you already created a table for it, choose ‘Use Existing Table.’ Otherwise, select ‘Create New Table’ and choose the database and create a table name.
Step 5: Workbench will automatically suggest column names and data types based on the CSV. Adjust if needed (e.g., change numbers to INT or DECIMAL, dates to DATE).
Step 6: Click Next → Finish.
Exercise #9#
On a new tab, use the clauses we’ve learned so far to analyze the dataset. Use 3 new clauses from the ‘Common Syntax’ tab to further analyze your dataset. Save this script to your folder.
Section 3: Final Documentation#
Based on the analysis you conducted, write a short summary on Google Docs (around 3–5 sentences) answering questions like:
How do you find SQL syntax?
What type of insights did you get from analyzing the dataset?
Did anything surprise you?
Section 5: Save & Upload Your Work#
Create a folder in Google Drive and name it: YourName_SQL_Queries
Upload:
Your database
Your queries
Your summary
✅ Make sure everything is clearly labeled!
One you’re done, share your folder with Vicky Huang