Data Analysis

Machine Learning for Big Data / Statistics for Big Data

Course number: CSE 599 C1 / STAT 592

T/Th 10:30-11:50, EEB 045

Today, data analysis methods in machine learning and statistics play a central role in industry and science. The growth of the Web and improvements in data collection technology in science have led to a rapid increase in the magnitude and complexity of these analysis tasks. This growth is driving the need for scalable, parallel and online algorithms and models that can handle this "Big Data". This course will provide a broad foundation for this timely challenge.

Certificate in Data Science

Develop the computer science, mathematics and analytical skills needed to enter the field of data science, in the context of practical application. Discover how to use data science techniques to analyze and extract meaning from extremely large data sets, or “big data.” Become familiar with modern database systems, data models, and query interfaces. Learn how to use statistics, machine learning, text retrieval and natural language processing to analyze data and interpret results. Practice using these tools and techniques on data sets of increasing complexity and scale.

Introduction to Resampling Inference

Course number: STAT 403

Introduction to computer-intensive data analysis for experimental and observational studies in empirical sciences. Students design, program, carry out, and report applications of bootstrap resampling, rerandomization, and subsampling of cases.

Probability and Statistics for Computer Science

Course number: STAT 391

Fundamentals of probability and statistics from the perspective of the computer scientist. Random variables, distributions and densities, conditional probability, independence. Maximum likelihood, density estimation, Markov chains, classification. Applications in computer science.

Statistical Software and Its Applications

Course number: STAT 302

Introduction to data structures and basics of implementing procedures in statistical computing packages, selected from but not limited to R, SAS, STATA, MATLAB, SPSS, and Minitab. Provides a foundation in the computational components of data analysis.
