CMPS 445 Data Mining and Visualization
Sections 1 and 2 - Winter 2014
Instructor and Contact Information
Melissa Danforth
Office: Sci III 338, 654-3180
Office Hours: MWF 2:00-3:15pm and Th 1:00-2:30pm
Email: melissa@cs.csubak.edu

Course website: http://www.cs.csubak.edu/~melissa/ under the Teaching menu
Moodle website: http://moodle.cs.csubak.edu/moodle/course/view.php?id=79

Course meets MWF 3:30-4:40pm and Tu 3:15-5:45pm in Sci III 311

Catalog Description
CMPS 445 Data Mining and Visualization (5)
Knowledge discovery in and visualization of large datasets, including data warehouses and text-based information systems. Topics covered include data mining concepts, information retrieval, analysis methods, storage systems, visualization, implementation, and applications. Prerequisite: CMPS 312.
Prerequisites by Topic
Data structures
Algorithm analysis
Relations and sets
Graph theory
Units and Contact Time
5 quarter units. 4 units lecture (200 minutes), 1 unit lab (150 minutes).
Type
Selected elective for CS
Required Textbook
Data Mining: Concepts and Techniques, Third edition. Jiawei Han, Micheline Kamber, and Jian Pei. Morgan Kaufmann Publishers, 2012, ISBN-13 978-0-12-381479-1.
Recommended Textbook and Other Supplemental Materials
Author's website (textbook errata): http://www.cs.uiuc.edu/~hanj/bk3
Coordinator(s)
Melissa Danforth
Student Learning Outcomes
This course covers the following ACM/IEEE Body of Knowledge student learning outcomes:

CC-IS: Intelligent Systems
CC-IM: Information Management

ABET Outcome Coverage
The course maps to the following performance indicators for Computer Science (CAC/ABET):
(CAC PIa1): Apply and perform the correct mathematical analysis.
(CAC PIb1): Identify key components and algorithms necessary for a solution.
(CAC PIb2): Produce a solution within specifications.
(CAC PIe1): Recognize ethical issues involved in a professional setting.
Lecture Topics and Rough Schedule

Not in Book   Ethics of Data Mining                            Weeks 1 and 2
Chapter 1     Introduction                                     Week 1
Chapter 2     Getting to Know Your Data                        Weeks 1 and 2
Chapter 3     Data Preprocessing                               Weeks 2 and 3
Chapter 6     Mining Frequent Patterns: Basic Concepts         Weeks 3 to 5
Chapter 7     Advanced Pattern Mining (selected topics)        Week 5
Chapter 8     Classification: Basic Concepts                   Weeks 5 to 7
Chapter 9     Classification: Advanced Methods                 Weeks 7 and 8
Chapter 10    Cluster Analysis: Basic Concepts                 Weeks 8 and 9
Chapter 11    Advanced Cluster Analysis (selected topics)      Weeks 9 and 10
Chapter 12    Outlier Detection (selected topics)              Week 10
Chapter 4     Data Warehousing (selected topics)               Week 10
Estimated ABET Category Content

Math and Basic Sciences ?? Credit Hours
Computing ?? Credit Hours ?? Fundamental, ?? Advanced
Engineering Topics ?? Credit Hours
Engineering Design ?? Credit Hours
Design Content Description
Not applicable to this course.
Attendance
Students are responsible for their own attendance. The topics covered in lecture will be listed on the course website. Lab attendance is not required but is strongly encouraged.
Academic Integrity Policy
Assignments may be discussed in groups. If the assignment is a group assignment, the group turns in one assignment for the entire group. However, if the assignment is an individual assignment, each student must turn in their own work; no direct copying is allowed. You may discuss individual assignments with other students, but you must write up the assignment in your own words. Any direct copying from other students, the textbook, Internet resources, etc. that the instructor detects will result in a grade of 0 for that assignment. Refer to the Academic Integrity policy in the campus catalog.
Computer Labs Outside of Class
The CEE/CS Tutoring Center in Sci III 324 is available for use by students in this course outside of class time on a first-come, first-served basis. Priority in the lab is given to students who are completing assignments for CEE/CS courses. See the schedule on the door for the hours the lab will be open.

There are also computers available in the CEE/CS Major Study Lounge in Sci III 341 (formerly the CEE/CS Library). This room is only open when faculty members are on campus, which is approximately 8am to 5pm on weekdays. If the door is locked, see Steve, Lori, me, or another faculty member to unlock it.

Grading
Labs/Homework 25%
Midterm 25%
Project 25%
Final 25%
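As a rough illustration of how the four 25% weights combine, the following Python sketch computes an overall course percentage. It also applies the rule from the Late Policy that the lowest lab grade does not count. The function names are hypothetical, every score is assumed to be a percentage from 0 to 100, and pooling the remaining labs with the homework as equally weighted items is an assumption; the syllabus does not say how labs and homework are combined within the Labs/Homework category.

```python
# Category weights taken from the Grading section (each 25%).
WEIGHTS = {
    "labs_homework": 0.25,
    "midterm": 0.25,
    "project": 0.25,
    "final": 0.25,
}


def labs_homework_percent(lab_scores, homework_scores):
    """Average labs and homework after dropping the lowest lab score.

    ASSUMPTION: labs and homework are pooled as equally weighted items.
    """
    labs = sorted(lab_scores)
    # Drop the single lowest lab, but only if more than one lab exists.
    kept_labs = labs[1:] if len(labs) > 1 else labs
    items = kept_labs + list(homework_scores)
    return sum(items) / len(items) if items else 0.0


def course_percent(percents):
    """Weighted sum of the four category percentages."""
    return sum(WEIGHTS[name] * percents[name] for name in WEIGHTS)


if __name__ == "__main__":
    lh = labs_homework_percent([100, 80, 60], [90])
    total = course_percent(
        {"labs_homework": lh, "midterm": 80, "project": 100, "final": 70}
    )
    print(lh, total)
```

In this made-up example the lab score of 60 is dropped, so the Labs/Homework category averages 100, 80, and 90.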
Homework/Lab Policies
Labs will be group assignments. Each group, consisting of 1-3 students, will turn in one assignment for the entire group. Be sure all names are on the assignment so all group members receive credit.

Homework assignments may be discussed in groups, but every student must turn in their own assignment in their own words. Refer to the Academic Integrity Policy above.

Assignments will be posted online on the course website. The due date will be given with the assignment.

Late Policy
Late labs are not accepted; however, partial credit will be given for incomplete labs. The lowest lab grade will not count towards your overall Labs/Homework total.

Late homework is accepted, but it will be marked down 10% for every day it is late. Saturday and Sunday combined count as only one day late (e.g. if the assignment is due Friday and you turn it in Sunday, it will be marked as one day late). If the assignment itself states the last day it can be turned in late, that policy applies to that particular assignment. Otherwise, homework assignments that are more than three days late will not be accepted.
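The late homework rule can be sketched in Python as follows. This is only an illustration under stated assumptions, not the instructor's grading procedure: the function names are hypothetical, the 10% deduction is assumed to apply to the earned score, the weekend rule is assumed to apply when checking the three-day cutoff as well, and a per-assignment late policy (which would override the cutoff) is not modeled.

```python
from datetime import date, timedelta

PENALTY_PER_DAY = 0.10  # 10% per late day, from the Late Policy
MAX_LATE_DAYS = 3       # default cutoff, from the Late Policy


def late_days(due: date, submitted: date) -> int:
    """Count late days, with Saturday and Sunday together counting once."""
    counted = set()
    day = due + timedelta(days=1)
    while day <= submitted:
        if day.weekday() == 6:
            # Sunday: fold into the preceding Saturday so the weekend counts once.
            counted.add(day - timedelta(days=1))
        else:
            counted.add(day)
        day += timedelta(days=1)
    return len(counted)


def adjusted_score(score: float, due: date, submitted: date) -> float:
    """Apply the late deduction; return 0.0 if past the default cutoff.

    ASSUMPTION: the deduction is a fraction of the earned score.
    """
    days = late_days(due, submitted)
    if days > MAX_LATE_DAYS:
        return 0.0  # more than three days late: not accepted
    return score * (1 - PENALTY_PER_DAY * days)


if __name__ == "__main__":
    friday = date(2014, 2, 7)
    sunday = date(2014, 2, 9)
    print(late_days(friday, sunday))           # Friday to Sunday: one late day
    print(adjusted_score(90, friday, sunday))  # one day's deduction applied
```

Folding each Sunday into the Saturday before it means an assignment due Friday and turned in on either weekend day counts as one day late, and one turned in the following Monday counts as two.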

Homework/Lab Submission
Assignments are submitted by emailing the instructor from the CEE/CS department server (coding assignments) or by attaching files to the Moodle website (non-coding assignments). All files must be in text, OpenOffice or PDF format. If scanning a hand-written page, use a standard image format such as JPG, PNG, or GIF, or use PS or PDF format.

Do NOT use Microsoft Office formats, particularly DOCX or XLSX, as they cannot be read by the instructor on all grading machines. Do not use Gmail or any other email method for the coding assignments, as the campus spam filter may block the email without notifying either you or the instructor.

Allow at least one week after the assignment due date for the grade to be posted to Moodle. All coding assignments will have the Comment section of Moodle updated to say "assignment received" within a day or two of the instructor receiving the email, even if the assignment has not yet been graded.

It is your responsibility to check Moodle to see if your assignment has been received. If you believe you emailed the assignment on time but the instructor has not received it, contact the instructor.

Project
All students will be required to complete a data mining project during the course of this class. Students may work in teams for the project. Projects may be either a stand-alone data mining project or participation in a data mining competition at Kaggle.com.

Each project must have a proposal which lists the nature of the project, the team members, any previous work any team member has done on the project, and a brief list of tools that will be needed for the project. Project proposals are due by the end of the 3rd week of class and will count for a portion of the Project grade.

At the end of the quarter, a project writeup will be required. Requirements for the writeup will be posted on Moodle and discussed in class. The writeup will count for the remaining portion of the Project grade.

Midterm
The midterm will be on Tuesday, February 4th, during the lab time.

If you cannot make the midterm due to class conflicts, you can schedule an alternate time by contacting the instructor at least ONE WEEK in advance.

A make-up midterm will only be given if you have to miss the midterm for serious and compelling reasons. You must notify the instructor of the reason for missing the midterm as soon as possible after missing the midterm.

Final
Wednesday March 19, 2014 from 5:00pm to 7:30pm in Sci III 311

If you cannot make the scheduled final time because it conflicts with another final or you have more than two finals scheduled that day, you MUST contact the instructor ONE WEEK in advance of the final to schedule an alternate time.

If sufficient requests are made for an alternate time, the instructor reserves the option to set up a second final session at an alternate time for the students who cannot make the regular final session time. If the instructor opts to schedule a second final session, it will be announced on Moodle and in class.

Prepared By
Melissa Danforth on 31 December 2013
Approval
Approved by CEE/CS Department on [date]
Effective Winter 2014