The emergence of data science as a discipline has changed how businesses operate. One primary impact has been to elevate the use of data in decision-making, with statistical methods applied to the ever-growing datasets that companies collect. This workshop reviews and introduces statistical techniques and touches on more advanced methods for dealing with noisy data and applying real-world constraints to analyses. It assumes a working knowledge of standard statistical methods and aims to connect theory to practice using real-world examples.

Lesson 1: Descriptive statistics and exploring data statistically

- (Re)familiarize yourself with basic descriptive statistics
- Use simple data exploration techniques to identify problems and limitations of a new dataset
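
As a rough illustration of the kind of first-pass check this lesson describes, the sketch below summarizes one numeric column and flags missing values and outliers. It is not taken from the workshop materials: the `describe` helper, the `days_to_hire` data, and the 1.5 × IQR outlier rule are all assumptions chosen for the example.

```python
import statistics

def describe(values):
    """Summarize a numeric column, treating None as a missing value."""
    present = [v for v in values if v is not None]
    # Quartiles using the statistics module's default ("exclusive") method.
    q1, median, q3 = statistics.quantiles(present, n=4)
    iqr = q3 - q1
    # Assumed rule of thumb: flag points more than 1.5 * IQR outside the quartiles.
    low, high = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    return {
        "count": len(present),
        "missing": len(values) - len(present),
        "mean": statistics.mean(present),
        "median": median,
        "stdev": statistics.stdev(present),
        "outliers": [v for v in present if v < low or v > high],
    }

# Hypothetical data: days to fill ten job openings, one value missing.
days_to_hire = [21, 25, 19, 30, 27, None, 23, 24, 180, 26]
summary = describe(days_to_hire)
for name, value in summary.items():
    print(f"{name}: {value}")
```

Here the single value of 180 pulls the mean well above the median, which is the sort of limitation a quick exploration is meant to surface before any modeling.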

Lesson 2: Statistical analyses

- Review of statistical tests to compare datasets and groups within those data
- Assessments of correlations and other qualities of the data with an eye towards modeling
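
One way to picture the two topics above is a small sketch that compares two groups and measures a correlation. This is an illustrative example only, not the workshop's own method: it uses a permutation test (chosen because it needs nothing beyond the standard library) and a hand-written Pearson coefficient, and the function names and data are hypothetical.

```python
import random
import statistics

def permutation_test(a, b, n_resamples=5000, seed=0):
    """Two-sided permutation test for a difference in group means."""
    rng = random.Random(seed)
    observed = statistics.mean(a) - statistics.mean(b)
    pooled = list(a) + list(b)
    extreme = 0
    for _ in range(n_resamples):
        # Reassign group labels at random and recompute the difference.
        rng.shuffle(pooled)
        diff = statistics.mean(pooled[:len(a)]) - statistics.mean(pooled[len(a):])
        if abs(diff) >= abs(observed):
            extreme += 1
    # Add-one correction keeps the estimated p-value above zero.
    return observed, (extreme + 1) / (n_resamples + 1)

def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = statistics.mean(x), statistics.mean(y)
    cov = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    sx = sum((xi - mx) ** 2 for xi in x) ** 0.5
    sy = sum((yi - my) ** 2 for yi in y) ** 0.5
    return cov / (sx * sy)

# Hypothetical measurements for two groups.
group_a = [12.1, 11.8, 12.6, 13.0, 12.4, 12.9]
group_b = [10.2, 10.9, 11.1, 10.5, 10.8, 10.4]
diff, p_value = permutation_test(group_a, group_b)
print(f"difference in means: {diff:.3f}, estimated p-value: {p_value:.4f}")

# Hypothetical paired data with a perfectly linear relationship.
r = pearson_r([1, 2, 3, 4, 5], [2, 4, 6, 8, 10])
print(f"Pearson r: {r:.3f}")
```

A parametric test such as a t-test would be the more conventional choice when its assumptions hold; the permutation approach is used here only to keep the sketch self-contained.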

Lesson 3: More advanced analyses and methods

- Linear modeling and interpreting its statistical outputs
- From statistics to machine learning: connections and methodologies
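
To make "statistical outputs" concrete, the following sketch fits a one-variable least-squares line and reports the slope, intercept, R², and the standard error of the slope. It is a minimal example under stated assumptions rather than the workshop's code: `fit_line` and the data are hypothetical, and only the simple single-predictor case is handled. The same fit is what machine learning libraries typically call linear regression, which is one point of contact between the two fields.

```python
def fit_line(x, y):
    """Ordinary least squares fit of y = intercept + slope * x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = my - slope * mx
    # Residual and total sums of squares give the coefficient of determination.
    ss_res = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - my) ** 2 for yi in y)
    r_squared = 1 - ss_res / ss_tot
    # Standard error of the slope, using n - 2 degrees of freedom.
    slope_se = (ss_res / (n - 2) / sxx) ** 0.5
    return {
        "slope": slope,
        "intercept": intercept,
        "r_squared": r_squared,
        "slope_se": slope_se,
    }

# Hypothetical data lying close to the line y = 2x.
x = [1, 2, 3, 4, 5]
y = [2.1, 3.9, 6.2, 7.8, 10.1]
fit = fit_line(x, y)
for name, value in fit.items():
    print(f"{name}: {value:.4f}")
```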



Training Overview

  ODSC East 2020: Statistics for Data Science

    • Training Overview and Author Bio
    • Before you get started: Prerequisites and Resources
    • Statistics for Data Science

Instructor Bio:

Andrew Zirm

Andrew is an astrophysicist (Ph.D.) who made the switch from academia to data science in 2014 via the Insight Data Science program. He was the first data scientist hired at Greenhouse Software, where he has worked on many internal data science projects and a few customer-facing, data-powered product features. Andrew lives in New Jersey with his wife and son.

Andrew Zirm, PhD

Senior Data Scientist | Greenhouse Software