This web page will serve as the syllabus for the course. Please read it carefully. You should become familiar with these policies. To do so, you will likely need to return to the syllabus several times throughout the semester. After the start of the semester, this document may continue to be updated. Any such changes will be announced.
For simplicity, the course staff will exclusively refer to the course as STAT 432.
This Fall 2020 version of the course is online.
Please refer to the course staff by their given names. For example, your instructor is named David. If you refer to the staff as “Professor” or “TA,” we might refer to you as “student,” which seems odd.
Teaching Assistants are PhD students from the Department of Statistics. Course Assistants are students who previously completed STAT 432. Course Associates are students who have completed at least one semester as a Course Assistant for STAT 432.
STAT 432 provides a broad overview of machine learning, through the eyes of a statistician. As a first course in machine learning, core ideas are stressed, and specific details are de-emphasized. After completing the course, students should be able to train and evaluate statistical models. While we will not discuss an exhaustive list of methods, given the framework developed throughout the course, students should feel comfortable exploring new methods and models on their own. Previous experience with R programming is necessary for success in the course, as students will be tested on their ability to apply the methods discussed using a statistical computing environment.
Tentative subjects include:
After this course, students are expected to be able to …
Note: These objectives are similar to the objectives for the Society of Actuaries Exam PA: Predictive Analytics. (See details in their linked syllabus.) While STAT 432 was not specifically designed to prepare students for the SOA Exam PA, the coverage may be sufficient to sit for the exam, although some additional exam-specific study may be required.
The main text for the course will be BSL. Within BSL, readings from ISL may be assigned. If BSL and ISL provide conflicting guidance, we will defer to BSL in this course. When reading ISL, you do not need to read the sections dedicated to R. We will follow only the R conventions from BSL.
A course that covers linear regression using R, such as STAT 420 or STAT 425. Basic knowledge of probability and linear algebra is also assumed. A working knowledge of the material from the following three texts would also be sufficient.
R
R
R for Data Science
We will use several forms of communication for this course. The website will be the one-stop-shop for all course information. Compass will be used to send announcements which will also be sent via email.
If you would like to communicate with the course staff, our preferred methods of communication, in order, are:
For Fall 2020, all office hours will be held online via Zoom.
Staff and Link | Day | Time |
---|---|---|
Zoom with Matthew | Wednesday | 4:00 PM - 5:00 PM |
Zoom with Jonas | Wednesday | 5:00 PM - 7:00 PM |
Zoom with David | Wednesday | 7:00 PM - 8:00 PM |
Zoom with Jingyu | Wednesday | 8:00 PM - 10:00 PM |
Zoom with Jasmine | Thursday | 4:00 PM - 5:00 PM |
Zoom with David | Thursday | 7:00 PM - 8:00 PM |
Zoom with Tianyi | Thursday | 8:00 PM - 10:00 PM |
The office hour schedule is always subject to change. As such, the dates and times will be posted each week along with the course materials.
Office hours are by far our preferred forum for discussing questions specific to an individual. In office hours, our response is immediate. Also, since we are both present in the same physical location (or together on Zoom), follow-up is both expected and easy. Textual forms of communication such as Piazza or email have a slower response time and a much lower communication bandwidth. In other words, please come to office hours!
If you would like to schedule a private meeting outside of regular office hours, please send an email suggesting two possible times, on two different days. (A total of four suggested times.) We have a preference for time-slots directly adjacent to current office hours. Please also indicate a brief agenda for the meeting. Requests to schedule a meeting at a time less than 24 hours in the future are unlikely to be granted.
This course will use Piazza for some course communications.
IREADTHESYLLABUS
Please register your account with your University email.
The course staff will attempt to check Piazza at least once a day, so you can usually expect a response within 24 hours, except on weekends. If you need a quicker response, you should consider office hours as an alternative.
The course staff would strongly prefer the use of Piazza to GroupMe or similar services not officially supported by the course. The course staff feels that a GroupMe may exclude members of the course, whereas all are welcome on Piazza.
Private posts have been disabled. Any private matters should be discussed over email where your identity is known and private. Some anonymous posting is disabled. (You may post anonymously to your classmates, but not the course staff.)
Additional Piazza policy can be found in a pinned post on Piazza.
Due to the large size of this course, we follow a strict email policy. Instead of email, consider Piazza! Any quick, non-private communication should take place there.
If you’d like to email the instructor or course staff, consider the following:
If you choose to send an email, you must adhere to the following three rules. If you do not, your email will be considered less important than emails that follow the rules, and the response time will be slower.
Your email must be sent from an @illinois.edu email address or appear as sent on behalf of an @illinois.edu address.
## good
[STAT 432] Grade feedback question
## bad
## improper format
## non-descriptive subject
[stat432] hi
## bad
## improper format
[STAT432] Grade feedback question
## bad
## improper format
## subject too long
## information found in syllabus or website
[STAT 432]when is the first CBTF exam and what is covered on the exam?
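The subject examples above suggest a fixed prefix: the tag `[STAT 432]`, in exactly that form, followed by a space and then the subject text. The following is a minimal sketch of a prefix check, inferred only from those examples. The function name `has_valid_prefix` is hypothetical, and the sketch does not check subject length or descriptiveness, since no limits are stated here.

```python
import re

# Assumed pattern, inferred from the good/bad examples above:
# the literal tag "[STAT 432]", one space, then non-space subject text.
SUBJECT_PREFIX = re.compile(r"^\[STAT 432\] \S")


def has_valid_prefix(subject: str) -> bool:
    """Return True if the subject starts with the '[STAT 432] ' tag."""
    return SUBJECT_PREFIX.match(subject) is not None


if __name__ == "__main__":
    examples = [
        "[STAT 432] Grade feedback question",
        "[stat432] hi",
        "[STAT432] Grade feedback question",
        "[STAT 432]when is the first CBTF exam and what is covered on the exam?",
    ]
    for subject in examples:
        print(has_valid_prefix(subject), subject)
```

A subject can pass this check and still be a bad subject, for example one that is too long or that asks something answered in the syllabus.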
If your email is sent between 9:00 AM Monday and 11:59 PM Thursday, and you follow the above directions, we will try our best to respond within 24 hours. Questions about an assessment sent the same day the assessment is due will likely not receive a response before the assessment is due. Plan accordingly.
Role | Name | Email |
---|---|---|
Instructor | David Dalpiaz | dalpiaz2@illinois.edu |
Teaching Assistant | Tianyi Qu | tianyiq3@illinois.edu |
Course Associate | Jingyu Li | jli173@illinois.edu |
Course Assistant | Matthew Lezak | mlezak2@illinois.edu |
Course Assistant | Jonas Reger | wreger2@illinois.edu |
Course Assistant | Soham Saha | sohams2@illinois.edu |
Course Assistant | Jasmine Yi | fangyun2@illinois.edu |
If your question is technical in nature, there are several steps you can take to ensure a speedy response on Piazza or in email.
First and foremost, you should ask Google before you ask the course staff. Take the error message you obtained and search for it with Google. The ability to solve problems this way is an extremely valuable skill, possibly one of the most important you should learn (but are not taught) during your academic career. Make a legitimate effort to solve the problem on your own. You won’t always be able to, and if you can’t, post on Piazza. (Or better yet, stop by office hours.)
If you need to ask the course staff, include the following in your Piazza post or email:
Do not use screenshots to share code and error messages. Copy and paste them as text so that others can copy and paste them as well.
In this course, for everything except exams, we greatly prefer over-sharing to under-sharing code. We would rather everyone learn from others’ “mistakes” than have everyone experience the same issues over and over again.
With the exception of exams, all course assignments are due at 11:59 PM, Central (Champaign) time, on the listed due date.
Throughout the semester, quizzes will be administered through the PrairieLearn system. (9 for undergraduates, 10 for graduate students.) These will be low-stakes, unlimited attempt quizzes. That is, there is no penalty for submitting incorrect answers, and your score can only go up, never down. These quizzes will serve as practice for exams. No quizzes will be dropped. Instead, there will be opportunity to earn buffer points with each quiz. Buffer points will allow you to obtain over 100% for a particular assignment, but your percentage on quizzes overall cannot exceed 100%.
The buffer point and late submission details can be seen in the details of each quiz on PrairieLearn. As an example, consider Quiz 01:
To obtain the 105% credit, you must achieve a score of 100% before the “due” date for 105% credit. (When we refer to a “due” date, we generally mean the deadline for obtaining 105% credit.)
Quizzes and exams will both use the PrairieLearn system. Use the link below to sign-up and add STAT 432.
There will be one midterm exam, proctored using CBTF Online. Details about the exam can be found on the Exams page.
There will be four data analyses (DA) throughout the semester. Specific policies and directions can be found on the Analyses page.
Except for the exam, all deadlines are at 11:59 PM, Champaign time, on the listed day.
Assessment | Deadline |
---|---|
Quiz 01 | Friday, August 28 |
Quiz 02 | Friday, September 4 |
Quiz 03 | Friday, September 11 |
Quiz 04 | Friday, September 18 |
Quiz 05 | Friday, September 25 |
Exam | Monday, October 5 |
Quiz 06 | Friday, October 9 |
Quiz 07 | Friday, October 16 |
Quiz 08 | Friday, October 23 |
Quiz 09 | Friday, October 30 |
Analysis 01 | Wednesday, November 11 |
Analysis 02 | Wednesday, November 18 |
Analysis 03 | Wednesday, December 2 |
Analysis 04 | Wednesday, December 9 |
Grad Quiz | Wednesday, December 9 |
R and RStudio are required software for this course. You will need access to a computer where you have the ability to install and update this software.
R is a freely available language and environment for statistical computing and graphics.
R
.
It is your responsibility to make sure you are using the most recent version of both R and RStudio. Failure to use the most recent version of R will result in an inability to complete the quizzes.
Compass will be used to distribute grades and for assignment submissions.
Assessment | Percentage |
---|---|
Quiz | 50 |
Exam | 25 |
Analysis | 25 |
The quiz sub-score will be the average of the 9 quizzes for undergraduates. (It will be the average of 10 quizzes for graduate students.) If your quiz sub-score is above 100 as a result of buffer points, it will be recorded as 100. Similarly, the sub-score for the analyses will be the average of the individual analyses.
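As a rough illustration of how the weights and the quiz cap combine, here is a minimal sketch. The function name `overall_score` and the sample scores are hypothetical, and the sketch assumes all scores are percentages that are combined without intermediate rounding; the official calculation may differ in such details.

```python
from statistics import mean

# Weights from the table above, expressed as fractions of the overall score.
WEIGHTS = {"quiz": 0.50, "exam": 0.25, "analysis": 0.25}


def overall_score(quiz_scores, exam_score, analysis_scores):
    """Combine sub-scores (all in percent) into an overall percentage."""
    # Buffer points can push individual quizzes above 100,
    # but the quiz sub-score is recorded as at most 100.
    quiz_sub = min(mean(quiz_scores), 100.0)
    analysis_sub = mean(analysis_scores)
    return (
        WEIGHTS["quiz"] * quiz_sub
        + WEIGHTS["exam"] * exam_score
        + WEIGHTS["analysis"] * analysis_sub
    )


if __name__ == "__main__":
    # Hypothetical undergraduate: 105% on all 9 quizzes, 80 on the exam,
    # and 90 on each of the four analyses.
    print(overall_score([105] * 9, 80, [90, 90, 90, 90]))
```

In this hypothetical case the quiz average of 105 is recorded as 100, so the quizzes contribute 50 points, the exam 20, and the analyses 22.5.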
Modifier | A | B | C | D |
---|---|---|---|---|
Plus | 99 | 87 | 77 | 67 |
Neutral | 93 | 83 | 73 | 63 |
Minus | 90 | 80 | 70 | 60 |
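The cutoff table above can be read as a simple lookup. The sketch below makes several assumptions: each cutoff is treated as an inclusive lower bound, no rounding is applied to the overall score, and anything below 60 maps to F, which the table does not state explicitly. The function name `letter_grade` is hypothetical.

```python
# Cutoffs from the table above, highest first: (minimum score, letter).
CUTOFFS = [
    (99, "A+"), (93, "A"), (90, "A-"),
    (87, "B+"), (83, "B"), (80, "B-"),
    (77, "C+"), (73, "C"), (70, "C-"),
    (67, "D+"), (63, "D"), (60, "D-"),
]


def letter_grade(score):
    """Map an overall percentage to a letter grade.

    Assumes inclusive lower bounds and "F" below the lowest cutoff.
    """
    for cutoff, letter in CUTOFFS:
        if score >= cutoff:
            return letter
    return "F"


if __name__ == "__main__":
    for score in (99, 92.5, 83, 59.9):
        print(score, letter_grade(score))
```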
The instructor reserves the right to lower, but not raise, grade cutoffs. However, this policy should not create an expectation that this will happen. Asking for a change in cutoffs will make any change in cutoffs less likely.
Grading in the course is not competitive. There is nothing (other than some statistical realities) that would prevent the entire class from receiving a grade of A.
If you feel an assignment was graded incorrectly, you have one week from the date you received a grade to discuss it with the instructor. After one week, grading is final except for exceptional circumstances. You may not simply ask for a re-grade, but instead must justify to the instructor why the grading was done incorrectly. By disputing any grading, you agree to allow the instructor to review the entire assessment in question for other errors missed during grading. Requests must be sent via email. (Failure to follow the email policy will result in your request being denied.) Grade disputes over trivial points will likely be met with frustration. (A grade on a single assignment is not reflective of your overall grade in the course. The generous buffer points should more than make up for a single point deduction on a single assignment.)
All grade disputes must be discussed with the course instructor. Teaching Assistants and Course Assistants do not have authority to modify grades.
The official University of Illinois policy related to academic integrity can be found in Article 1, Part 4 of the Student Code. Section 1-402 in particular outlines behavior which is considered an infraction of academic integrity. These sections of the Student Code will be upheld in this course. Any violations will be dealt with in a swift, fair, and strict manner. In short, do not cheat; it is not worth the risk. You are more likely to get caught than you believe. If you think you may be operating in a gray area, you most likely are.
Policies about specific assessment types will be released with directions for those assessments. Two heuristics to keep in mind:
Under no circumstances should course materials be provided to Course Hero, Chegg, or any similar for-profit website. The course staff will seek the harshest possible academic integrity penalty for any students who do so.
The university values your safety. Please read this document or watch this video.
To obtain disability-related academic adjustments or auxiliary aids, students with disabilities must contact the course instructor and the Disability Resources and Educational Services (DRES) as soon as possible. To contact DRES, you may visit 1207 S. Oak St., Champaign, call 217-333-4603, e-mail disability@illinois.edu or go to the DRES website.
To ensure appropriate accommodation is provided in a timely manner, please provide your Letter of Accommodation during the first week of class. Letters received after a relevant assessment has been administered will likely cause logistical issues that could result in an inability to accommodate.
For some thoughts on teaching philosophy, some explanation of policies, and some general tips for success, please see The Extended Syllabus.
The instructor reserves the right to make any changes he considers academically advisable. Such changes, if any, will be announced. Please note that it is your responsibility to keep track of the course proceedings.