Before booking:

Before registering for a course, please read our General Terms of Participation (Allgemeine Teilnahmebedingungen; pdf, 38 KB) and, in particular, our Fairplay guidelines on registering and deregistering (Fairplay: An- und Abmelden; pdf, 299 KB). Thank you!


Here you will find answers to frequently asked questions.


To deregister from a course, please go to your participant homepage.



Contact:

kurssekretariat@id.uzh.ch


R: Web Scraping

Collecting and preprocessing data is always the first step in a data analysis project or a machine learning pipeline. The web plays a crucial role here: authoritative statistical data are often published as tables on regularly updated websites, and data found on social networks might provide valuable ground truth for training machine learning algorithms. However, gathering data from websites is often not straightforward and requires an understanding of the architecture of the web.

In this course, you'll learn how to leverage R to collect and parse data found on various kinds of websites. By doing so, you'll get to know typical website architectures and how to approach them efficiently for scraping. The first part of the course will be held remotely and will introduce various concepts and R functions, while the second part will be held on site, where you'll be faced with some hands-on scraping challenges.

Learning Content

  • Overview of approaches for collecting data from a remote source
  • Introduction to different R packages for scraping (httr and rvest)
  • How to parse tabular data on websites into R data frames
  • Scraping best practices
  • Where to go from here & approaches for more complicated websites

Prerequisites

Either some basic knowledge of R (ideally with the Tidyverse) or completion of the following courses.

Course(s)

ARE - R: Basic Introduction
ARF - R: tidyverse for Data Science

Participants

Students and employees of the University of Zurich. This course is particularly suitable for students at the MSc/PhD level as well as other academic staff such as postdocs.

Course materials

Handouts will be distributed during the course.

Course sessions

Course ARW 1
Available places: 4
Duration: 3 day(s) / 9 hour(s)
Instructor: Timo Grossenbacher
Number of participants: min. 7, max. 24
Location: Sign in to the collaboration platform Teams with your UZH account. In the team "ZI - IT Fort- und Weiterbildungen" you will find the channel Y10-E-25, your virtual classroom for this course.
Y10-E-25 (on Teams)
Date/Time:
Monday, 7 June 2021, 17:00 - 20:00
Wednesday, 9 June 2021, 17:00 - 20:00
Friday, 11 June 2021, 17:00 - 20:00