Weekly Coding Checkup 10

A Semester Project Made of Custom-Designed Activities

We continue working on the Dr. B & Class project. In this second part of the semester, you will apply what we learned in class to a set of tasks and scenarios. Please remember that the code you submit must be your own and not somebody else's work. It is fine to make mistakes; only by practicing in R can you get a better grasp of the software.

I also want you to try building your document as an official report for a potential company (Dreaming Diamonds LLC), for which you are exploring the diamonds dataset (e.g., spend time on storytelling, commenting on results, and providing insights and conclusions when possible).

Summary of this week's tasks:

This week we will work on modeling workflows using tidymodels. Tidymodels is a unified collection of packages that supports a smooth approach to data modeling. The bundle includes individual packages that let you perform the necessary steps in a consistent, scalable, and flexible way. So, let’s practice what we covered this week:

  • Checking correlations

  • Data Splitting

Tip

Use the diamonds dataset to solve all the tasks below. I have subset it to 15k observations to speed up the modeling process.

Q1:

Write the code necessary to perform the following (2 points):

  • Add a price_log variable (the base-10 logarithm of price) to the diamonds dataset. Assign the manipulated dataset to an object called diamonds.

  • Compute the correlation matrix for all the interval (numeric) variables in the new diamonds dataset. Make sure to name the matrix diamonds_corr_matrix and to use only complete observations (use = "complete.obs").
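
As a rough illustration of the two steps above, here is a minimal sketch on the built-in mtcars dataset rather than diamonds, so it shows the pattern without solving the task. The names cars, mpg_log, and cars_corr_matrix are made up for this example, and it assumes the dplyr package is installed.

```r
library(dplyr)

# Add a base-10 log of mpg as a new column
# (cars and mpg_log are hypothetical names for this example)
cars <- mtcars %>%
  mutate(mpg_log = log10(mpg))

# Keep only the numeric columns, then compute pairwise Pearson
# correlations using only rows without missing values
cars_corr_matrix <- cars %>%
  select(where(is.numeric)) %>%
  cor(use = "complete.obs")

# Round for easier reading
round(cars_corr_matrix, 2)
```

Because a log-transformed variable is a monotonic function of the original, expect the two to be strongly correlated with each other; that is something to keep in mind when interpreting the matrix.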

Q2:

Write the code necessary to perform the following (1 point):

  • Visualize the correlation matrix created in the previous task with a correlation chart.
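
One possible way to draw a correlation chart is sketched below, again on mtcars instead of diamonds. It assumes the corrplot package is installed; other packages can produce similar charts, and the argument choices here (circles, upper triangle, black labels) are only one styling option, not a required one.

```r
library(corrplot)

# Correlation matrix for the built-in mtcars data
# (cars_corr_matrix is a hypothetical name for this example)
cars_corr_matrix <- cor(mtcars, use = "complete.obs")

# Draw the upper triangle as circles: size and color
# reflect the strength and sign of each correlation
corrplot(
  cars_corr_matrix,
  method = "circle",
  type = "upper",
  tl.col = "black"
)
```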

Q3:

What did you learn from the correlation analysis? Provide your interpretation of the correlation matrix values and the correlation matrix chart, assuming that price is your dependent variable (2 points).

Q4:

Write the code necessary to perform the following (3 points):

  • Set a seed for the data splitting task (pick any number you want).

  • Then create an initial split that allocates 85% of the data to the training set and 15% to the test set.

  • Create the training and test set. Make sure to name the training and test set as “diamonds_train” and “diamonds_test”.
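
The three steps above follow a common pattern, sketched here on mtcars so the example stays separate from the graded task. It assumes the rsample package (part of tidymodels) is installed; the seed value and the names cars_split, cars_train, and cars_test are arbitrary choices for illustration.

```r
library(rsample)

# Fix the random seed so the split is reproducible (any number works)
set.seed(123)

# Allocate 85% of the rows to training and the rest to testing
cars_split <- initial_split(mtcars, prop = 0.85)

# Extract the two data frames from the split object
cars_train <- training(cars_split)
cars_test <- testing(cars_split)

# Check how many rows ended up in each set
nrow(cars_train)
nrow(cars_test)
```

Setting the seed immediately before the split matters: without it, each run would assign different rows to each set and results would not be reproducible.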

Q4b:

How many observations do you have in your training and test sets? (1 point)

🛑 Don’t Click Submit Just Yet 🚧

Please read the information below carefully:

  • Once you have completed all the coding questions and you are confident in your work, copy and paste your responses from the chunk into the form fields below each question.

  • You are responsible for correctly copying and pasting only the code required to solve each question. We will grade only what you have submitted!

  • We will only grade 1 submission per student so do not click Submit until you are confident in your responses.

  • By submitting this form, you are certifying that you have followed the academic integrity guidelines available in the syllabus. The code and answers submitted are the result of your work and your work only!

  • Make sure you have completed all the questions and included all the required personal information (e.g., full name, email, zid) in the respective form’s fields.

  • Now you are ready to click the “Submit” button above. Congrats, you have completed your weekly coding checkup!!!