Lab 2: EC2 Setup and VSCode Remote Development
DSAN 6000 Fall 2025
Make sure that you are connected to the SaxaNet WiFi network and not the GuestNet network. SSH (TCP port 22) is blocked on GuestNet, which means that if you are on GuestNet you will not be able to connect to your cloud VMs or clone repos from GitHub via SSH. Use SaxaNet.
Follow these instructions step by step to set up your AWS EC2 environment. This EC2 instance will serve as your development environment throughout the semester, allowing you to run Python code and Jupyter notebooks using VSCode as your IDE.
You can click on any screenshot on this page to see its full size version.
Task 1: Accept the Invitation to join the AWS Classroom (the first time)
In your student email inbox, you will have an email from AWS Academy with the subject Course Invitation.
Open the email. Click on the Get Started button as shown in the screenshot below.
Click on the Create My Account button to create a new Canvas account (note that this Canvas account is different from your existing Georgetown Canvas account).
You should now be logged into AWS Academy Learner Lab and see a screen like the one shown below. Click on Learner Lab.
Scroll all the way to the bottom of the page and accept the Terms & Conditions.
You are now being logged in to the AWS Console. Notice the ⬤ adjacent to the word AWS towards the top left of the page. This indicates that the lab (i.e., the AWS resources) has not yet started. Click on the play button alongside Start Lab in the top right corner of the page to start the lab.
Each lab session that you start lasts at most 4 hours. A timer on the ribbon shows the remaining time (hh:mm), along with the remaining budget out of the $50 allocated to each account.
This environment is long-lived. When the session timer runs to 0:00, the session will end, but any data and resources that you created in the AWS account will be retained. If you later launch a new session (for example, the next day), you will find that your work is still in the lab environment. Running EC2 instances will be stopped and then automatically restarted the next time you start a session.
Monitor your lab budget in the lab interface above. Whenever you have an active lab session, the latest known remaining budget information will display at the top of this screen. This data comes from AWS Budgets which typically updates every 8 to 12 hours. Therefore the remaining budget that you see may not reflect your most recent account activity. If you exceed your lab budget your lab account will be disabled and all progress and resources will be lost. Therefore, it is important for you to manage your spending.
Log in to the AWS Console
The AWS Console is your entry point into the AWS cloud.
A new tab will open in your browser; this is the AWS Console.
Note the URL in your browser’s address bar: it will start with the name of the AWS region (such as us-east-1) in which your cloud resources are hosted.
Note the username in the top right-hand corner; this is your Federated Identity. Also note that you did not have to provide any credentials (username/password) to log in to the AWS Console. How did this happen?
Logging into the AWS Console at a later time
To access the AWS Console in the future, log in at https://www.awsacademy.com/LMS_Login, then go to Learner Lab -> Modules -> Start Lab.
If you already had an AWS account prior to logging into AWS Academy, you will need to log in to the AWS Educate AWS account via an incognito browser window.
Task 2: Create an EC2 Instance for Development
Now we’ll create an EC2 instance that will serve as your development environment throughout the semester. This instance will run Ubuntu Linux and can be accessed remotely using VSCode.
Configure Your Instance
You’ll now be on the “Launch an instance” page. Follow these steps to configure your instance:
Configure Instance Details
Name and tags: Give your instance a meaningful name, such as {your-net-id}-dsan6k-dev (replace {your-net-id} with your actual NET ID).
Application and OS Images (Amazon Machine Image):
- Select Ubuntu
- Choose Ubuntu Server 24.04 LTS (HVM), SSD Volume Type (or the latest Ubuntu LTS version available)
- Architecture: 64-bit (x86)
Instance type:
- Select t3.large from the dropdown
- This provides 2 vCPUs and 8 GiB of memory, suitable for data science workloads
- As of September 2025 in us-east-1, this instance costs approximately $0.083 per hour (roughly $0.42 for 5 hours)
Key pair (login):
- Click on Create new key pair
- Key pair name: {your-net-id}-dsan6k-f2025 (replace {your-net-id} with your actual NET ID)
- Key pair type: RSA
- Private key file format: Choose based on how you will connect:
  - .pem for Mac/Linux, and for Windows when connecting through VSCode Remote-SSH as in Task 4 (it uses the OpenSSH client)
  - .ppk for Windows only if using PuTTY
- Click Create key pair and save the file securely - you’ll need this to connect to your instance
SAVE YOUR KEY PAIR FILE SECURELY! This file is your only way to access your EC2 instance. Store it in a safe location on your computer and never share it with anyone.
Network settings:
- Leave all settings as default (this includes VPC, subnet, auto-assign public IP, and security group settings)
- The default settings will automatically enable public IP and allow SSH access
Configure storage:
- Change the root volume size from 8 GiB to 100 GiB
- Volume type: gp3 (General Purpose SSD)
- This provides ample storage for datasets and software installations
Advanced details (expand this section):
- IAM instance profile: Select LabInstanceProfile from the dropdown
- This gives your EC2 instance permissions to access other AWS services
- Leave all other settings as default
Review and Launch:
- Review all your settings
- Click the Launch instance button
Wait for Instance to Initialize
After clicking “Launch instance”, you’ll see a success message. Click on View all instances to go back to the instances list.
Your new instance will appear in the list with a status of “Pending”. Wait a few minutes for it to change to “Running” and for the status checks to pass (2/2 checks passed).
Once the instance is running, note down the Public IPv4 DNS - you’ll need this to connect via VSCode.
The instance typically takes 1-2 minutes to fully initialize and become accessible.
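The wait-and-note-the-DNS step can also be done from a terminal. The following is a minimal sketch, not part of the official lab steps: it assumes a shell where the AWS CLI is installed and already has credentials for your lab account, and the instance name is a placeholder. The function names are made up for this example.

```shell
# Sketch: look up an instance by its Name tag and print its public DNS.
# INSTANCE_NAME is a placeholder; use the name you gave your instance.
INSTANCE_NAME="your-net-id-dsan6k-dev"

# Print the ID of the pending or running instance with the given Name tag.
# Assumes exactly one instance carries that name.
instance_id_by_name() {
    aws ec2 describe-instances \
        --filters "Name=tag:Name,Values=$1" \
                  "Name=instance-state-name,Values=pending,running" \
        --query 'Reservations[].Instances[].InstanceId' \
        --output text
}

# Block until both status checks pass, then print the public IPv4 DNS.
public_dns_when_ready() {
    id=$(instance_id_by_name "$1") || return 1
    [ -n "$id" ] || { echo "no pending or running instance named $1" >&2; return 1; }
    aws ec2 wait instance-status-ok --instance-ids "$id" || return 1
    aws ec2 describe-instances --instance-ids "$id" \
        --query 'Reservations[].Instances[].PublicDnsName' \
        --output text
}

# Usage (needs credentials and network access):
#   public_dns_when_ready "$INSTANCE_NAME"
```

The printed value is what goes into the HostName field of your SSH configuration in Task 4.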
Task 3: Create and Configure an S3 Bucket
Amazon S3 (Simple Storage Service) is AWS’s object storage service that allows you to store and retrieve data from anywhere on the web. In this task, you’ll create an S3 bucket to store datasets, code outputs, and other files for your projects.
View S3 Buckets Dashboard
Once in the S3 service, you’ll see the main S3 buckets dashboard. This shows all your existing buckets (if any) and allows you to create new ones.
Create a New S3 Bucket
- Click on the Create bucket button (orange button on the right side of the page).
- Configure your bucket:
  - Bucket name: Enter a unique name like {your-net-id}-dsan6k-f2025 (replace {your-net-id} with your actual NET ID)
    - Note: Bucket names must be globally unique across all AWS accounts, lowercase, and can contain only letters, numbers, and hyphens
  - AWS Region: Leave as default (should match your EC2 region, typically us-east-1)
  - Object Ownership: Leave as default (ACLs disabled)
  - Block Public Access settings: Leave all boxes checked (block all public access) for security
  - Bucket Versioning: Leave as “Disable” for now
  - Tags: Optional - you can add tags like course: dsan6000 if desired
  - Default encryption: Leave as default (Server-side encryption with Amazon S3 managed keys)
  - Advanced settings: Leave as default
- Scroll to the bottom and click Create bucket.
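If the console rejects your bucket name, it can help to check the name locally first. The sketch below tests a candidate against the naming note above, plus two further S3 rules that the note does not mention (a length of 3 to 63 characters, and a letter or number as the first and last character). The function name is made up for this example, and the check cannot tell you whether the name is already taken by another account.

```shell
# Return 0 if $1 looks like an acceptable bucket name, 1 otherwise.
is_valid_bucket_name() {
    name=$1
    len=${#name}
    # S3 bucket names are 3 to 63 characters long.
    [ "$len" -ge 3 ] && [ "$len" -le 63 ] || return 1
    case $name in
        # Reject any character other than a lowercase letter, digit, or hyphen.
        *[!abcdefghijklmnopqrstuvwxyz0123456789-]*) return 1 ;;
        # Names must start and end with a letter or a number.
        -*|*-) return 1 ;;
    esac
    return 0
}

# Example with a placeholder NET ID:
if is_valid_bucket_name "abc123-dsan6k-f2025"; then
    echo "name looks valid"
fi
```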
View Your Created Bucket
After creation, you’ll be redirected to the S3 buckets list where you can see your newly created bucket.
Upload a Test File to Your Bucket
Click on your bucket name to open it.
Click the Upload button.
Either drag and drop files or click Add files to browse and select files from your computer.
For testing, you can create a simple text file on your computer with some content and upload it.
After selecting your file(s), click Upload at the bottom of the page.
Once uploaded, you’ll see a success message. Click Close to return to your bucket view.
Your S3 bucket is now ready to use! You can upload datasets, store output files, and share data between different AWS services. The bucket name you created will be used in subsequent labs and assignments.
Working with S3 from the Command Line
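Once you connect to your EC2 instance via VSCode (in the next section), you can interact with your S3 bucket using the AWS CLI. The LabInstanceProfile attached to the instance supplies the credentials, so no aws configure step is needed. If the aws command is not found, install the AWS CLI first (a stock Ubuntu image may not include it). Here are some useful commands you’ll use: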
# List all your S3 buckets
aws s3 ls
# List contents of a specific bucket
aws s3 ls s3://your-bucket-name/
# Upload a file to S3
aws s3 cp local-file.txt s3://your-bucket-name/
# Download a file from S3
aws s3 cp s3://your-bucket-name/file.txt ./
# Upload an entire directory
aws s3 sync ./local-folder s3://your-bucket-name/folder/
# Download an entire directory
aws s3 sync s3://your-bucket-name/folder/ ./local-folder
Remember to replace your-bucket-name with the actual name of the bucket you created (e.g., {your-net-id}-dsan6k-f2025).
Task 4: Connect to EC2 Instance using VSCode
Now that your EC2 instance is running, we’ll set up VSCode to connect to it remotely. This will allow you to write code, run Python scripts, and work with Jupyter notebooks directly on your EC2 instance using the familiar VSCode interface.
Prerequisites
Install Visual Studio Code on your local machine if you haven’t already: Download VSCode
Install the Remote - SSH extension in VSCode:
- Open VSCode
- Click on the Extensions icon in the sidebar (or press Ctrl+Shift+X)
- Search for “Remote - SSH”
- Install the extension by Microsoft
Video Tutorial
For a detailed walkthrough of connecting VSCode to an EC2 instance, watch this helpful video tutorial:
Alternatively, you can watch on YouTube directly.
Step-by-Step Connection Instructions
Configure SSH Connection
Set up your SSH key permissions (Mac/Linux only):
chmod 400 ~/path/to/your-key.pem
Replace ~/path/to/your-key.pem with the actual path to your downloaded key file.
Open VSCode and press F1, Ctrl+Shift+P (Windows/Linux), or Cmd+Shift+P (Mac) to open the command palette.
Type “Remote-SSH: Open SSH Configuration File” and select it.
Choose the configuration file to edit (usually ~/.ssh/config on Mac/Linux or C:\Users\YourUsername\.ssh\config on Windows).
Add the following configuration to the file:
Host dsan6000-ec2
    HostName YOUR_EC2_PUBLIC_DNS
    User ubuntu
    IdentityFile ~/path/to/your-key.pem
Replace:
- YOUR_EC2_PUBLIC_DNS with your instance’s public IPv4 DNS
- ~/path/to/your-key.pem with the actual path to your key file
Save the configuration file.
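If you prefer to generate the entry rather than type it, a small shell function can print it for you. This is only a sketch for Mac/Linux terminals; the function name is made up for this example, and the arguments shown are the same placeholders used above.

```shell
# Print an SSH config entry for a host alias, public DNS name, and key path.
ssh_config_entry() {
    printf 'Host %s\n    HostName %s\n    User ubuntu\n    IdentityFile %s\n' \
        "$1" "$2" "$3"
}

# Example with placeholder values (replace both before real use):
ssh_config_entry dsan6000-ec2 YOUR_EC2_PUBLIC_DNS '~/path/to/your-key.pem'
```

Redirecting the output with >> ~/.ssh/config appends the entry to your existing configuration instead of printing it.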
Connect to Your Instance
Press F1 or open the command palette again.
Type “Remote-SSH: Connect to Host” and select it.
Select dsan6000-ec2 from the list (or whatever name you gave your host).
VSCode will open a new window and connect to your EC2 instance.
If prompted about the platform, select Linux.
If this is your first connection, you’ll be asked to verify the authenticity of the host. Select Continue.
Every time your AWS session ends (typically after 4 hours) and you restart it:
- Your EC2 instance will automatically restart (give it ~2 minutes to become fully operational)
- The instance will have a NEW public IPv4 DNS/domain name
- You MUST update the HostName in your VSCode SSH config file (~/.ssh/config) with the new DNS
- Your data is safe! The storage is persistent - all your files, installed software, and work remain intact
- Only the compute infrastructure restarts with a new address
What this means for you:
1. After starting a new AWS session, wait 2 minutes for EC2 to fully start
2. Check the new public IPv4 DNS in the EC2 console
3. Update your VSCode SSH config with the new DNS
4. Reconnect VSCode to your instance
Remember: The old hostname is gone, but your content persists!
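Since the HostName has to be updated after every restart, you may want to script the edit. The following is a rough sketch for Mac/Linux terminals, with a made-up function name. It assumes the keywords in your config are spelled exactly Host and HostName, as in the entry above, and it changes only the block for the alias you name.

```shell
# Rewrite the HostName line inside the "Host <alias>" block of an SSH config.
# Usage: update_hostname <config-file> <host-alias> <new-public-dns>
update_hostname() {
    config=$1
    host_alias=$2
    new_dns=$3
    tmp=$(mktemp) || return 1
    awk -v host="$host_alias" -v dns="$new_dns" '
        # A new Host line starts a block; remember whether it is the target.
        $1 == "Host" { in_block = ($2 == host) }
        # Replace the HostName value only inside the target block.
        in_block && $1 == "HostName" {
            sub(/HostName[ \t]+[^ \t]+/, "HostName " dns)
        }
        { print }
    ' "$config" > "$tmp" && cat "$tmp" > "$config"
    status=$?
    rm -f "$tmp"
    return $status
}

# Usage:
#   update_hostname ~/.ssh/config dsan6000-ec2 NEW_PUBLIC_DNS
```

Writing the result back with cat keeps the original file's permissions, which matters because SSH may reject a config file that other users can write to.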
Initial Setup on EC2
Once connected, open a terminal in VSCode (Terminal → New Terminal) and run these commands to set up your development environment:
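# Update package list
sudo apt update
# Install Python and essential tools
sudo apt install -y python3-pip python3-venv git
# Create a working directory
mkdir -p ~/dsan6000
cd ~/dsan6000
# Create and activate a virtual environment: Ubuntu 24.04 refuses
# system-wide pip installs with an "externally-managed-environment" error
python3 -m venv .venv
source .venv/bin/activate
# Install Jupyter inside the virtual environment
pip install jupyter notebook jupyterlab
# Test S3 access with your bucket
# Replace 'your-net-id' with your actual NET ID
# If the aws command is not found, install the AWS CLI first,
# for example with: sudo snap install aws-cli --classic
aws s3 ls
aws s3 ls s3://your-net-id-dsan6k-f2025/
# Create a test file and upload it to S3
echo "Hello from EC2!" > test.txt
aws s3 cp test.txt s3://your-net-id-dsan6k-f2025/
# Download the file back from S3
aws s3 cp s3://your-net-id-dsan6k-f2025/test.txt downloaded-test.txt
cat downloaded-test.txt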
You can now use VSCode on your local machine to edit files, run Python scripts, and work with Jupyter notebooks directly on your EC2 instance. The Remote-SSH extension makes it feel like you’re working locally, but all computation happens on your EC2 instance.
Additionally, the LabInstanceProfile gives the AWS CLI on your EC2 instance its credentials automatically, allowing you to interact with S3 and other AWS services directly from the terminal.
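Before moving on, a quick preflight check can confirm that the tools this lab relies on are actually on your PATH. This is a sketch with a made-up function name; the list of tools is taken from the setup commands above.

```shell
# Print the name of every command in the argument list that is not on PATH.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || echo "$tool"
    done
}

# Report on the tools used in this lab; an empty result means all were found.
missing=$(missing_tools python3 git aws)
if [ -n "$missing" ]; then
    echo "Not found on PATH:" $missing
else
    echo "python3, git, and aws are all available"
fi

# To confirm that the instance profile credentials work (needs network access):
#   aws sts get-caller-identity
```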
Task 5: Complete Today’s Lab Assignment
Now that you have your EC2 instance set up with VSCode, let’s complete today’s lab assignment.
Accept the GitHub Classroom Assignment
Click on the following link to accept the lab assignment: GitHub Classroom Assignment - Lab 02
Accept the assignment, which will create a private repository for you.
Clone the Repository
In your VSCode terminal (connected to EC2), clone your assignment repository:
cd ~/dsan6000
git clone YOUR_ASSIGNMENT_REPO_URL
cd YOUR_REPO_NAME
Review the Assignment
The assignment is based on analyzing a dataset using cloud computing resources. You can review the analytics.ipynb notebook in your repo.
Complete the Tasks
Open the analytics.ipynb notebook in VSCode:
- Click on the notebook file in the VSCode explorer
- VSCode will automatically open it with Jupyter support
The notebook contains several tasks related to:
- Loading and exploring a dataset
- Creating an S3 bucket in your AWS account
- Uploading data to S3
- Performing basic data analysis
Work through each cell in the notebook, completing the tasks marked with “TODO” or “Task”.
Make sure to:
- Run all cells and verify the output
- Complete all required tasks
- Add your analysis and observations where requested
Submit Your Work
After completing all tasks, save your notebook.
Commit and push your changes to GitHub:
git add .
git commit -m "Complete Lab 02 - EC2 and data analysis"
git push origin main
Make sure to commit and push your completed notebook to your GitHub repository before the deadline.
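To double-check that nothing was left behind, you can ask git directly. The two helper functions below are a sketch with made-up names. They assume your branch is main and your remote is origin, matching the push command above, and they must be run from inside your assignment repository.

```shell
# Succeed only when the working tree has no uncommitted or untracked changes.
worktree_is_clean() {
    [ -z "$(git status --porcelain)" ]
}

# Succeed only when every local commit is already on origin/main.
everything_pushed() {
    git fetch --quiet origin &&
        [ "$(git rev-list --count origin/main..HEAD)" -eq 0 ]
}

# Usage, from inside the repository:
#   worktree_is_clean && everything_pushed && echo "submission is on GitHub"
```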
Important: Shutting Down Your Resources
To avoid unnecessary charges to your AWS account, it’s crucial to properly shut down your resources when not in use.
Stop Your EC2 Instance
When you’re done working:
- Go to the EC2 Dashboard in AWS Console
- Select your instance
- Click Instance State → Stop instance
- Confirm the action
Stopping an instance preserves your data and allows you to restart it later. Terminating an instance permanently deletes it and all associated data.
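The same stop action is available from the AWS CLI. The sketch below is one possible way to do it, with a made-up function name. It assumes a shell with the AWS CLI and credentials that are allowed to stop instances, and that exactly one running instance carries the given Name tag. Note that it calls stop-instances, never terminate-instances.

```shell
# Stop (not terminate) the running instance whose Name tag matches $1.
stop_instance_by_name() {
    id=$(aws ec2 describe-instances \
        --filters "Name=tag:Name,Values=$1" \
                  "Name=instance-state-name,Values=running" \
        --query 'Reservations[].Instances[].InstanceId' \
        --output text) || return 1
    [ -n "$id" ] || { echo "no running instance named $1" >&2; return 1; }
    aws ec2 stop-instances --instance-ids "$id"
}

# Usage (placeholder name; replace with your instance's name):
#   stop_instance_by_name your-net-id-dsan6k-dev
```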
End Your Lab Session
- Return to the AWS Academy Learner Lab page
- Click the End Lab button
- This will stop the billing timer for your session
At the end of each work session:
- Stop your EC2 instance (not terminate - stop preserves your work)
- End the lab in AWS Academy to stop the billing timer
- Remember you have a $50 budget for the entire semester - use it wisely!
Budget Management Tips:
- A stopped instance is billed only for its storage (minimal)
- A running t3.large instance costs approximately $0.083 per hour
- Always stop instances when not actively using them
- Set up billing alerts in AWS to monitor your spending
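To get a feel for how far the budget stretches, a little arithmetic helps. This sketch, with a made-up function name, divides the budget by an hourly rate, using the approximate $0.083 per hour quoted for t3.large in Task 2. The optional third argument sets aside a fixed amount for other charges such as storage; the $10 used in the second example is a hypothetical reserve, not a quoted price.

```shell
# Estimate how many hours of running time a budget buys.
# Usage: hours_for_budget <budget-usd> <hourly-rate-usd> [reserved-usd]
hours_for_budget() {
    awk -v b="$1" -v r="$2" -v f="${3:-0}" \
        'BEGIN { printf "%.0f\n", (b - f) / r }'
}

# The full $50 budget spent on a running t3.large:
hours_for_budget 50 0.083       # prints 602

# The same, with a hypothetical $10 held back for storage and other charges:
hours_for_budget 50 0.083 10    # prints 482
```

Roughly 600 hours is plenty for a semester of lab sessions, but an instance left running around the clock would use it up in about 25 days.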
Restarting Your Work
When you want to continue working:
- Start a new lab session in AWS Academy
- Go to EC2 Dashboard
- Select your instance and click Instance State → Start instance
- Wait for it to enter “Running” state
- Connect via VSCode as before (the public IPv4 DNS changes after a restart, so check the new value and update the HostName in your SSH config first)
Your data and installed software persist on the instance between sessions, so you can pick up right where you left off!
GitHub Classroom Link
Complete today’s assignment here: https://classroom.github.com/a/LzwXGbm_