Week 10
Wrangling Data in Python

Soci—269

Sakeef M. Karim
Amherst College

AN INTRODUCTION TO QUANTITATIVE SOCIOLOGY—CULTURE & POWER

Module III Begins–
November 3rd

Reminder

Coding Assignment in R

Coding Assignment Deadline

Your first coding assignment is due by 8:00 PM on Wednesday.

Reminder

Coding Assignment in R

Assignment instructions are available online.

Update

Coding Assignment in Python

Instructions for your second assignment are available online, too.

Introduction to Python

Installing Python Locally

Download the Anaconda Distribution.

You can, of course, also download Python directly from its main website.
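Once Python is installed, a short check along these lines can confirm which interpreter is active and whether the packages used in this module can be imported. This is a minimal sketch: check_packages is a hypothetical helper name, and the package list is an assumption based on the libraries mentioned in these slides.

```python
import importlib.util
import sys


def check_packages(names):
    """Map each top-level package name to True if it can be imported."""
    return {name: importlib.util.find_spec(name) is not None for name in names}


# Which interpreter is running, and which version is it?
print(sys.executable)
print(sys.version.split()[0])

# Assumed package list; adjust it to whatever your environment needs.
wanted = ["pandas", "polars", "seaborn", "matplotlib"]
for name, found in check_packages(wanted).items():
    print(f"{name}: {'installed' if found else 'missing'}")
```

Anything reported as missing can be added with conda or pip before the next session.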

Launching Python in R

We can use reticulate as a portal to Python from R:

library(reticulate)

# Create a new conda environment named "soci269" (only needs to run once):
conda_create("soci269")

# Add pandas, seaborn and matplotlib to your new conda environment:
conda_install("soci269", c("pandas", "seaborn", "matplotlib"))

# Moving forward, to use the conda environment created above, simply run:
# use_condaenv("NAME OF ENV GOES HERE")
# This must happen before the first import() call in a session:
use_condaenv("soci269", required = TRUE)

# GENERATING PLOTS VIA SEABORN --------------------------------------------

sns <- import("seaborn")
plt <- import("matplotlib.pyplot")

# Let's generate a simple plot via seaborn
# (requires the palmerpenguins R package for the data):
sns$set_theme()

sns$scatterplot(x = "bill_depth_mm",
                y = "bill_length_mm",
                hue = "species",
                data = palmerpenguins::penguins)

plt$show()
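The R session above reaches seaborn through reticulate; the same plot can also be drawn in Python directly. The sketch below assumes pandas, seaborn and matplotlib are installed. plot_bills is a hypothetical helper name, and the toy rows are made-up placeholder values (not real penguin measurements) so that the example runs without downloading anything.

```python
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns


def plot_bills(df):
    """Scatter bill depth against bill length, coloured by species."""
    sns.set_theme()
    fig, ax = plt.subplots()
    sns.scatterplot(data=df, x="bill_depth_mm", y="bill_length_mm",
                    hue="species", ax=ax)
    return ax


# Placeholder rows with made-up values, only so the sketch runs offline.
toy = pd.DataFrame({
    "species": ["Adelie", "Adelie", "Gentoo", "Gentoo"],
    "bill_depth_mm": [18.0, 17.5, 14.0, 15.0],
    "bill_length_mm": [39.0, 40.5, 47.0, 49.5],
})
```

In a notebook, calling plot_bills(toy) renders the figure inline; in a script, follow the call with plt.show(). With an internet connection, sns.load_dataset("penguins") should return the full dataset with these same column names.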

Positron May Be the Future

Download the Positron IDE.

Warning

Positron is still in its infancy.

Using Jupyter Notebooks

Interactive .ipynb Files

Jupyter Notebooks are living, interactive (.ipynb) documents. They allow users to craft a narrative, edit and execute lines of Python or R code in real time, and generate a wide range of outputs.
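Under the hood, an .ipynb file is a JSON document that stores a list of cells. The sketch below builds a minimal two-cell notebook by hand to show that structure. make_notebook is a hypothetical helper, and the field names follow version 4 of the notebook format as I understand it, so treat the details as an illustration and not as a full specification.

```python
import json


def make_notebook(markdown_text, code_text):
    """Build a minimal notebook dict: one markdown cell, one code cell."""
    return {
        "nbformat": 4,
        "nbformat_minor": 4,
        "metadata": {},
        "cells": [
            # Narrative text lives in markdown cells.
            {"cell_type": "markdown", "metadata": {}, "source": markdown_text},
            # Code cells also record their outputs once they have been run.
            {
                "cell_type": "code",
                "metadata": {},
                "execution_count": None,  # the cell has not been run yet
                "outputs": [],
                "source": code_text,
            },
        ],
    }


nb = make_notebook("# Week 10", "print('Hello, SOCI 269')")
text = json.dumps(nb, indent=1)

# List the cell types stored in the serialized notebook.
print([cell["cell_type"] for cell in json.loads(text)["cells"]])
```

Saving text to a file whose name ends in .ipynb should give a document that Jupyter or Colab can open, with the narrative and the code sitting side by side.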

Using Colab

We’ll be using .ipynb files and Colaboratory (Colab) for Module III.

Using Colab

Note

The rest of today’s session will take place in Colab!

From pandas to polars
November 5th

A Reminder

Coding Assignment in Python

Instructions for your second assignment are available online, too.

Final Reminder

Coding Assignment in R

Coding Assignment Deadline

Your first coding assignment is due by 8:00 PM tonight.

Back to Jupyter

Launch Colab

Note

The rest of today’s session will, once again, take place in Colab.

See You Monday

Reference(s)

McKinney, Wes. 2022. Python for Data Analysis: Data Wrangling with Pandas, NumPy, and Jupyter. 3rd Edition. Sebastopol, CA: O’Reilly.