
Final Part 1

Instructions

Wait until the discussion section is held to be placed in a group; your TA will randomly assign you to one. Because groups will still be somewhat in flux during the first 2 weeks of class (late adds/drops), make sure there is some flexibility in your planning. Reach out to your TA or IA early if there are any issues with group members dropping. Sometimes a group member drops the class without telling their group and simply goes silent. If you are unsure, ask your TA, who can verify the student’s status.

Click the link above to open the Google Doc for part one. You cannot edit this document; you must create a duplicate of it in your own Google Drive. Your group will work out of that duplicate. When ready to submit, save/export the document as a PDF and turn it in as one submission (per group) in Gradescope. You may resubmit the document as many times as you like before the deadline.

The first part was meant to get you and your group started on your final project. This part will require you to complete the first half of your final project. In this report, you will include information in the following sections of what will become your final project:

  • Question
  • Hypothesis
  • Background Information
  • Data
  • Ethical Considerations

Do not make a separate Gradescope submission for each group member. Submit the document once as a group, with all names/PIDs on the PDF.

Additional Resources

Frequently Asked Questions - FAQ

  1. How specific does our data science question need to be?

    Not very specific at this point. Coming up with a good data science question is very tough. It is typical to begin a project with a more relaxed, general question. As you perform analysis, seek more data, and talk with domain experts/stakeholders, your questions tend to get more specific. The final project is a good place to take your Assignment 1 question and refine it to be more specific. For now, just think about what type of data and topics interest your group, and formulate a question from that limited information.

  2. Do we need to find actual data that fits our question?

    Good question! No, you do not. In fact, I would be surprised if any team finds a good data set with everything they need to perform such an analysis. What we are looking for is that you sought out a data set with data that is close enough to your topic/question. If there are not enough observations, or there are missing fields/variables, that is ok! You can still use the data; just let us know what or how much is missing. For example, a data question could study the association between height and GPA, but perhaps your dataset does not include GPA, only avg. test score per class. This is ok; just let us know what data you wish you had and what that data would look like.

  3. How much background info do we need?

    We can’t all be domain experts, nor may you always be able to find one. Think about logical/critical thinking. Do your hypothesis and data science questions flow together? How do you justify your hypothesis and position? What limited background info have you read about? Talk a bit about these decisions in your project.
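For FAQ 2, one way to report "what or how much is missing" is to count empty values per column. The following is only a minimal sketch using Python's standard `csv` module. The sample data, the column names (`student_id`, `height_cm`, `avg_test_score`), and the helper `missing_summary` are all made up for illustration; your own dataset and tools may differ.

```python
import csv
import io

# Hypothetical sample data: column names and values are invented for illustration.
SAMPLE_CSV = """student_id,height_cm,avg_test_score
1,170,88
2,,91
3,165,
4,180,79
"""


def missing_summary(rows, columns):
    """Return {column: (missing_count, missing_fraction)} for a list of dict rows."""
    total = len(rows)
    summary = {}
    for col in columns:
        # Treat None, empty strings, and whitespace-only strings as missing.
        missing = sum(
            1 for row in rows
            if row.get(col) is None or row[col].strip() == ""
        )
        summary[col] = (missing, missing / total if total else 0.0)
    return summary


reader = csv.DictReader(io.StringIO(SAMPLE_CSV))
rows = list(reader)
summary = missing_summary(rows, reader.fieldnames)

# Print one line per column, e.g. for inclusion in the Data section.
for col, (count, frac) in summary.items():
    print(f"{col}: {count} of {len(rows)} missing ({frac:.0%})")
```

To try this on a real file, you could replace `io.StringIO(SAMPLE_CSV)` with an opened file handle. Note that real datasets often mark missing values with placeholders such as `NA` or `-1`, which this sketch does not detect.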