GGR274 Lab 11: Data analysis and result presentation

Logistics

Your lab grade is based on submitting this notebook to MarkUs during the lab session (or by 23:59 on Thursday).

You do not have to answer every question, but you should submit your notebook as usual for attendance. Here are the instructions for submitting to MarkUs (same as last week):

  1. Download this file (Lab_11.ipynb) from JupyterHub. (See our JupyterHub Guide for detailed instructions.)

  2. Submit this file to MarkUs under the lab11 assignment. (See our MarkUs Guide for detailed instructions.)

Note: there’s no autograding set up for this week’s lab, but your TA will be checking that your submitted lab file is complete as part of your “lab attendance” grade.

Lab 11 Introduction

Last week, we went over some questions you should ask yourself during preliminary data exploration and analysis. This week, we will ask some questions to consider while analyzing your data and deciding how to visualize your results. Since your project is due very soon, start working on it now if you have not started already.

What you write in this and the following labs is for your own reference. You may answer the questions in either words or code, but whenever possible, you should name the function you intend to use.

Let’s get started:

What is the main analytical method that you will employ to analyze your data? What functions/packages will you need?

# Write your notes here using comments

Do you want to generate a confidence interval for your analysis? How could you accomplish this? (hint: refer to lecture on bootstrapping)
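As one possible starting point for the bootstrapping hint, here is a minimal sketch of a percentile bootstrap confidence interval for a mean. The function name `bootstrap_ci`, its parameters, and the sample values are made up for illustration; they are not taken from the lecture or from any course dataset, so adapt them to your own variable and statistic.

```python
import numpy as np

def bootstrap_ci(values, stat=np.mean, n_boot=5000, level=0.95, seed=0):
    """Percentile bootstrap confidence interval (hypothetical helper)."""
    values = np.asarray(values, dtype=float)
    rng = np.random.default_rng(seed)
    boot_stats = np.empty(n_boot)
    for i in range(n_boot):
        # Resample with replacement, keeping the original sample size
        resample = rng.choice(values, size=values.size, replace=True)
        boot_stats[i] = stat(resample)
    # Take the middle `level` share of the bootstrap statistics
    alpha = 1 - level
    lower, upper = np.percentile(boot_stats, [100 * alpha / 2, 100 * (1 - alpha / 2)])
    return lower, upper

# Made-up sample values, for illustration only
sample = np.array([12.1, 9.8, 11.4, 10.9, 13.2, 8.7, 10.1, 12.6, 9.5, 11.0])
low, high = bootstrap_ci(sample)
print(f"Sample mean: {sample.mean():.2f}, 95% CI: ({low:.2f}, {high:.2f})")
```

To bootstrap a different statistic, pass another function as `stat`, for example `np.median`.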

# Write your notes here using comments

What are some ways to check for the accuracy of your analysis? Which functions will you need? (hint: it may be useful to split your data into training and testing sets and think about overfitting/underfitting)
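To make the training/testing hint concrete, the sketch below splits simulated data 80/20, fits a straight line on the training portion only, and compares the mean squared error on both portions. The data are simulated, the 80/20 ratio is an arbitrary choice, and `np.polyfit` stands in for whichever model you actually use; none of this is prescribed by the course.

```python
import numpy as np

rng = np.random.default_rng(42)

# Simulated data: a linear relationship plus noise (illustration only)
x = rng.uniform(0, 10, size=200)
y = 2.0 * x + 1.0 + rng.normal(0, 1.0, size=200)

# Shuffle the row positions, then use 80% for training and 20% for testing
indices = rng.permutation(x.size)
n_train = int(0.8 * x.size)
train_idx, test_idx = indices[:n_train], indices[n_train:]

# Fit the line using the training data only
slope, intercept = np.polyfit(x[train_idx], y[train_idx], deg=1)

def mse(x_vals, y_vals):
    """Mean squared error of the fitted line on the given data."""
    predictions = slope * x_vals + intercept
    return np.mean((y_vals - predictions) ** 2)

train_mse = mse(x[train_idx], y[train_idx])
test_mse = mse(x[test_idx], y[test_idx])
print(f"Training MSE: {train_mse:.2f}, testing MSE: {test_mse:.2f}")
```

A testing error much larger than the training error suggests overfitting; large errors on both suggest underfitting.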

# Write your notes here using comments

How do you plan to visualize your data (e.g., bar plot, scatter plot)? Which functions will you use?
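As a rough illustration of two common options, the sketch below draws a bar plot and a scatter plot side by side with matplotlib. All variable names and values are invented for the example and do not come from any course dataset; replace them with columns from your own data.

```python
import matplotlib.pyplot as plt

# Made-up example data (illustration only)
neighbourhoods = ["A", "B", "C"]
mean_income = [52.0, 61.5, 47.3]            # thousands of dollars
commute_minutes = [22, 35, 41, 18, 29, 33]
stress_score = [3.1, 4.8, 5.5, 2.7, 4.0, 4.6]

fig, axes = plt.subplots(1, 2, figsize=(10, 4))

# Bar plot: compare one number across categories
axes[0].bar(neighbourhoods, mean_income)
axes[0].set_title("Mean income by neighbourhood")
axes[0].set_xlabel("Neighbourhood")
axes[0].set_ylabel("Mean income ($1000s)")

# Scatter plot: relationship between two numeric variables
axes[1].scatter(commute_minutes, stress_score)
axes[1].set_title("Stress score vs. commute time")
axes[1].set_xlabel("Commute time (minutes)")
axes[1].set_ylabel("Stress score")

fig.tight_layout()
# In a notebook the figure is displayed automatically;
# in a script, call plt.show() to display it.
```

Titles and axis labels are set explicitly here because the rubric asks for graphs with meaningful captions and titles.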

# Write your notes here using comments

Do you want to visualize your variables on a map? Which functions will you use?

# Write your notes here using comments

What do you think are some of the weaknesses associated with your method of analysis?

# Write your notes here using comments

You may want to take a look at the project rubric!

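import pandas as pd
# keep_default_na=False keeps empty cells as empty strings instead of NaN
slidesrubric = pd.read_csv('confrenceslidesrubric.csv', keep_default_na=False)
# Display the rubric without the row index
slidesrubric.style.hide(axis='index')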
| Criteria | Category | Excellent | Good | Adequate | Poor |
| --- | --- | --- | --- | --- | --- |
| Content | Reasonable scope | The scope of the analysis is clear and questions can be fully addressed using the available data. | The scope of the analysis is clear and questions can be reasonably addressed using the available data. | The scope of the analysis is less clear; the questions can somewhat be addressed using the available data with slight modifications. | The questions are beyond the scope and cannot be reasonably addressed with the available data; need to resort to additional data or complete modification. |
| | Data wrangling | Creative use of data wrangling to produce informative variables. | Appropriate use of data wrangling to create sensible variables. | Some use of data wrangling to create new variables. | No evidence of data wrangling to create any variables. |
| | Graphical display | The choice of graphs is appropriate and creative; graphs reveal useful information and tell a story. Meaningful captions and titles. | The choice of graphs is appropriate; graphs reveal useful information, but are not self-sufficient. Might require some explaining. | The choice of graphs is appropriate; graphs reveal some useful information. Might require some explaining and minor changes to titles/axes/labels, etc. | A lack of visual aids; graphs are inappropriate and reveal no information. |
| | Statistical methods | The choice of method is appropriate; analyses are complete; diverse and creative use of more than one approach. | The choice of method is appropriate; some non-essential analyses are missing. | The choice of method is somewhat appropriate; some analyses are missing. | The choice of method is inappropriate; essential analyses are missing. |
| | Appropriate conclusion | Results are clearly and completely summarized. Appropriate limitations and concerns are clearly stated. | Results are completely summarized. Some limitations and concerns are stated. | Some results are summarized. The conclusion is not appropriate and there is no mention of any limitations. | Results are not summarized and the conclusion is missing. |
| Writing | Organization | Contents are very well organized under the appropriate section and subsection headings. | Contents are organized under the appropriate section and subsection headings. | Contents are somewhat organized under section and subsection headings. | Contents are poorly organized under section and subsection headings. |
| | Overall Writing | Very polished and well written. Few errors in spelling, punctuation, and/or grammar. | Mostly clear and understandable. | Partly unclear, but mostly understandable. Several errors in spelling, punctuation, and/or grammar. | Too many errors in spelling, punctuation, and/or grammar, which make it unclear and difficult to follow. |