Summary and Schedule
This is a new lesson built with The Carpentries Workbench.
| | Setup Instructions | Download files required for the lesson |
| Duration: 00h 00m | 1. Introducing Python | What is Python? How do we assign variables in Python? How do we perform actions in Python? How does Python handle different types of data? |
| Duration: 01h 00m | 2. Introducing pandas | What data will we be working with in this lesson? What is pandas? Why use pandas for data analysis? How do we read and write data using pandas? |
| Duration: 02h 00m | 3. Accessing Data in a Dataframe | How can we look at individual rows and columns in a dataframe? How can we perform calculations? How can we modify the table and data? |
| Duration: 03h 00m | 4. Aggregating and Grouping Data | How do we calculate summary statistics? How do we group data? How do null values affect calculations? |
| Duration: 04h 00m | 5. Combining Dataframes | How do we combine data from multiple sources using pandas? How do we add data to an existing dataframe? How do we split and combine data columns? |
| Duration: 05h 00m | 6. Data Workflows and Automation | Can I automate operations in Python? What are functions and why should I use them? |
| Duration: 06h 30m | Finish | |
The actual schedule may vary slightly depending on the topics and exercises chosen by the instructor.
This lesson uses JupyterLab and pandas to analyze and visualize data from an ecology dataset. It was adapted from Data Management with SQL for Ecologists, from which it currently reuses substantial passages of text, and it covers material similar to Data Analysis and Visualization in Python for Ecologists. The goal was to create parallel SQL and Python lessons for use by the Smithsonian Carpentries.
This lesson is designed to use JupyterLab, an interactive development environment widely used for doing data science with Python. The rest of this page provides instructions for setting up the software and data needed to complete this lesson. Please complete the setup prior to your scheduled lesson.
Setup
We will use a program called Miniconda to set up JupyterLab, so first we need to download and install Miniconda (64 bit). We recommend using the following installers:
- Windows: Miniconda3 Windows 64-bit
- macOS:
  - Apple M1: Miniconda3 macOS Apple M1 64-bit pkg
  - Intel: Miniconda3 macOS Intel x86 64-bit pkg
Using the command-line interface
A command-line interface, or CLI, is an application that can run commands supplied as text. Examples of command-line interfaces include the Windows command prompt and Unix shells, including bash. We’ll be using the CLI to install and run JupyterLab.
Each operating system has one or more command-line interfaces available. We recommend using the following applications for this lesson:
- Windows: Use the Anaconda Prompt, which was installed as part of Miniconda. You can find it by searching for Anaconda Prompt in the search box on the Windows toolbar.
- macOS: Use the Terminal. You can find it in the Applications/Utilities folder or by searching for Terminal using Spotlight.
The commands given below may not work if a different application is used, so we’d strongly encourage you to use the recommended ones.
Running commands
To run any of the commands presented here, copy and paste it into the CLI, then press Enter.
Installing JupyterLab
Once the command-line interface is open, run the following commands to install the software needed for this course:
SH
conda create --name python-ecology-lesson
conda activate python-ecology-lesson
conda install --channel conda-forge --yes altair jupyterlab pandas
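Once the install finishes, one way to confirm that the environment contains the three packages is a short Python check, run while the python-ecology-lesson environment is active. This is a minimal sketch rather than part of the original lesson; the helper name `installed_versions` is hypothetical, and it only reads package metadata without importing the packages.

```python
from importlib import metadata


def installed_versions(packages):
    """Map each package name to its installed version, or None if it is missing."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


# The three packages named in the conda install command above
for name, version in installed_versions(["altair", "jupyterlab", "pandas"]).items():
    print(f"{name}: {version or 'NOT INSTALLED'}")
```

If any package is reported as not installed, check that the environment is active and rerun the conda install command.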
Downloading data to the lesson folder
Next we’ll create the lesson folder and download the data needed for the lesson:
- Create a folder called python-ecology-lesson on your desktop
- Create a folder called data inside the python-ecology-lesson folder
- Download the following files into the data folder:
  - surveys.csv: https://figshare.com/ndownloader/files/10717177
  - species.csv: https://figshare.com/ndownloader/files/3299483
Alternatively, we can run the following commands to create the lesson folders and download the necessary data:
SH
cd ~/Desktop
mkdir -p python-ecology-lesson/data
cd python-ecology-lesson/data
wget -O surveys.csv https://figshare.com/ndownloader/files/10717177
wget -O species.csv https://figshare.com/ndownloader/files/3299483
Note for Windows users
The commands used here assume that your desktop is in a standard location. If you are using a Windows computer with OneDrive enabled, your desktop may be in a different place. You can use the following command in place of cd ~/Desktop in the instructions on this page to get to your desktop no matter where it is:
POWERSHELL
cd $([Environment]::GetFolderPath("Desktop"))
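If wget is not available in your command-line interface, the same folders and files can be created from Python using only the standard library. The following is a rough sketch, not the lesson's official method: the function name `download_lesson_data` and its `fetch` parameter are hypothetical, and the URLs are the two listed above.

```python
from pathlib import Path
from urllib.request import urlretrieve

# File names and URLs as listed in the download instructions
DATA_FILES = {
    "surveys.csv": "https://figshare.com/ndownloader/files/10717177",
    "species.csv": "https://figshare.com/ndownloader/files/3299483",
}


def download_lesson_data(lesson_dir, files=DATA_FILES, fetch=urlretrieve):
    """Create lesson_dir/data and download each file into it.

    Files that already exist are left untouched, so the function can be rerun.
    """
    data_dir = Path(lesson_dir) / "data"
    data_dir.mkdir(parents=True, exist_ok=True)  # also creates lesson_dir if needed
    saved = []
    for filename, url in files.items():
        target = data_dir / filename
        if not target.exists():
            fetch(url, target)
        saved.append(target)
    return saved
```

Calling `download_lesson_data(Path.home() / "Desktop" / "python-ecology-lesson")` assumes the desktop is at the standard location. If yours is elsewhere (for example, under OneDrive), pass the actual path to the lesson folder instead.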
Running JupyterLab
Once JupyterLab has been installed, we can run it by opening the command-line interface and running the following commands:
SH
conda activate python-ecology-lesson
cd ~/Desktop
cd python-ecology-lesson
jupyter-lab
JupyterLab should now open in a new browser window.
Test your installation
Try running the commands above before the scheduled lesson. Does JupyterLab appear as expected?
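With JupyterLab open, a quick check that the data files are where the lesson expects them could look like the sketch below. It is an illustration only: the helper name `missing_data_files` is hypothetical, and the final two lines assume the notebook is saved directly in the python-ecology-lesson folder, so that "." refers to that folder.

```python
from pathlib import Path


def missing_data_files(lesson_dir, expected=("surveys.csv", "species.csv")):
    """Return the expected data files that are absent from lesson_dir/data."""
    data_dir = Path(lesson_dir) / "data"
    return [name for name in expected if not (data_dir / name).is_file()]


# "." is the folder the notebook is running in (assumed to be the lesson folder)
missing = missing_data_files(".")
print("All data files found" if not missing else f"Missing: {missing}")
```

An empty result means both surveys.csv and species.csv were found in the data folder.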