tnh19002/tyler-hinrichs-ucsas-2024

THIS REPOSITORY HAS MOVED

Please visit the updated location of this repository for all future updates: https://github.com/tylernh10/tyler-hinrichs-ucsas-2024

tyler-hinrichs-ucsas-2024

If you want to start analyzing sports, you need data. Nowadays there are many sources of pre-built datasets, but at times you may need to build a custom dataset from data found online. Web scraping is the most effective solution to this problem: you can write automated scripts that quickly and efficiently gather data from webpages, and in doing so create datasets specific to the questions you want answered. During this workshop you will learn 1) what web scraping is, 2) how static web scraping works using the Python packages pandas, requests, and BeautifulSoup, and 3) how dynamic web scraping works using the Python package Selenium in conjunction with the previously learned packages.

Important information about each notebook:

static_soccer_data.ipynb:

  • Uses three Python libraries (Requests, BeautifulSoup4, and Pandas), which can be installed with the commands in the notebook
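
As a rough illustration of the static scraping workflow, here is a minimal sketch that is not taken from the notebook. The sample HTML, the table id `standings`, and the helper names `fetch_html` and `table_to_dataframe` are all hypothetical. The sketch parses an inline HTML string so that it runs without network access. On a real site you would pass the result of `fetch_html(url)` instead, after checking that the site permits scraping.

```python
import pandas as pd
from bs4 import BeautifulSoup

# Hypothetical sample markup standing in for a downloaded page.
SAMPLE_HTML = """
<table id="standings">
  <tr><th>Team</th><th>Played</th><th>Points</th></tr>
  <tr><td>Team A</td><td>10</td><td>24</td></tr>
  <tr><td>Team B</td><td>10</td><td>19</td></tr>
</table>
"""


def fetch_html(url):
    """Download a page and return its HTML text (not called in this sketch)."""
    import requests  # imported here so the offline demo below has no network dependency

    response = requests.get(url, timeout=10)
    response.raise_for_status()  # fail loudly on 4xx/5xx responses
    return response.text


def table_to_dataframe(html, table_id):
    """Parse the <table> with the given id into a DataFrame of strings."""
    soup = BeautifulSoup(html, "html.parser")
    table = soup.find("table", id=table_id)
    if table is None:
        raise ValueError(f"No table with id={table_id!r} found")

    rows = table.find_all("tr")
    # First row holds the column names; remaining rows hold the data cells.
    headers = [th.get_text(strip=True) for th in rows[0].find_all("th")]
    records = [
        [td.get_text(strip=True) for td in row.find_all("td")]
        for row in rows[1:]
    ]
    return pd.DataFrame(records, columns=headers)


standings = table_to_dataframe(SAMPLE_HTML, "standings")
# Scraped cells arrive as text, so convert numeric columns explicitly.
standings[["Played", "Points"]] = standings[["Played", "Points"]].astype(int)
print(standings)
```

Keeping the download step separate from the parsing step lets you test the parsing logic on saved HTML before pointing the script at a live page.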

dynamic_soccer_data.ipynb:

Slides:

  • Created using R Markdown
  • Open the .rmd file (requires R to run) or the rendered html file

About

Materials for UCSAS 2024
