Authors: Zhicong and Tianxin (Jessica)

Email: zhicong@umich.edu & ltianxin@umich.edu

High or Low? Predicting Recipe Ratings from Complexity & Nutritional Health Information

Introduction

This project uses the Recipes and Ratings dataset from Food.com, which combines two tables, user‑submitted recipes and their corresponding reviews, originally scraped for a recommender‑systems study. We are interested in exploring:

How do a recipe’s nutritional health information and complexity relate to its ratings?

Why it matters

Home cooks, nutritionists, and recipe‑platform designers should care because understanding these relationships can:

  • Help users discover recipes they’re more likely to enjoy
  • Support healthier choices by revealing how nutrition influences ratings
  • Optimize recommendation engines for both taste and health

Preparing the data

  • Merging: We left-merged the recipes table with the reviews table so that every recipe in the recipes dataset is kept along with all of its reviews.
  • Zero ratings changed to NaN: The lowest score a reviewer can actively give is 1. A recorded value of 0 almost always means no rating was provided even though review text was submitted. We therefore convert rating = 0 to NaN before analysis so that missing opinions don’t bias averages.
  • Averaging ratings: We calculated the average rating per recipe and added it as a new column, avg_rating.
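The three preparation steps above can be sketched in pandas roughly as follows. This is a minimal illustration on toy data, not the project's actual notebook code: the function name `prepare` and the key columns `id` (recipes table) and `recipe_id` (reviews table) are assumptions, as is the name `rating` for the raw score column.

```python
import numpy as np
import pandas as pd

def prepare(recipes: pd.DataFrame, interactions: pd.DataFrame) -> pd.DataFrame:
    """Left-merge reviews onto recipes, mask zero ratings, add avg_rating."""
    # Left merge keeps every recipe, including those with no reviews.
    # Key column names "id" / "recipe_id" are assumed for this sketch.
    merged = recipes.merge(
        interactions, how="left", left_on="id", right_on="recipe_id"
    )
    # A rating of 0 is treated as "no rating given", so mark it as missing.
    merged["rating"] = merged["rating"].replace(0, np.nan)
    # Per-recipe mean; NaN ratings are skipped by default.
    merged["avg_rating"] = merged.groupby("id")["rating"].transform("mean")
    return merged

# Toy data, invented purely for illustration.
recipes = pd.DataFrame({"id": [1, 2, 3], "name": ["soup", "cake", "salad"]})
interactions = pd.DataFrame(
    {"recipe_id": [1, 1, 2, 2], "rating": [5, 0, 4, 2]}
)
result = prepare(recipes, interactions)
print(result[["id", "rating", "avg_rating"]])
```

In this toy example the zero rating on recipe 1 is ignored rather than dragging its average down, and recipe 3, which has no reviews, keeps a row with a missing avg_rating.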

Dataset Size

  • Number of unique recipes (rows): 234429
  • Number of columns in the merged dataset: 17

Relevant Columns & Descriptions

Column          Description
minutes         Preparation time in minutes
n_steps         Number of instruction steps in the recipe
n_ingredients   Total count of distinct ingredients used
tags            Food.com tags for the recipe (some of which are health-related)
nutrition       List of [calories (#), total fat (PDV), sugar (PDV), sodium (PDV), protein (PDV), saturated fat (PDV), carbohydrates (PDV)]; PDV = % daily value
avg_rating      Mean user rating (1–5), computed by averaging all nonzero ratings submitted
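Because nutrition packs seven values into a single list, it usually has to be split into separate numeric columns before it can be used as features. The sketch below shows one way to do that, assuming the column is stored as a string such as "[100.0, 5.0, ...]"; the helper name `expand_nutrition` and the new column names are hypothetical, and the toy values are invented.

```python
import ast
import pandas as pd

# Hypothetical column names, in the order given in the table above.
NUTRITION_COLS = [
    "calories", "total_fat_pdv", "sugar_pdv", "sodium_pdv",
    "protein_pdv", "saturated_fat_pdv", "carbohydrates_pdv",
]

def expand_nutrition(df: pd.DataFrame) -> pd.DataFrame:
    """Split the 'nutrition' list column into one numeric column per value."""
    # Parse string-encoded lists; leave values that are already lists as-is.
    parsed = df["nutrition"].apply(
        lambda v: ast.literal_eval(v) if isinstance(v, str) else v
    )
    expanded = pd.DataFrame(
        parsed.tolist(), index=df.index, columns=NUTRITION_COLS
    )
    return df.join(expanded)

# Toy rows, invented purely for illustration.
toy = pd.DataFrame(
    {"nutrition": [
        "[100.0, 5.0, 10.0, 2.0, 8.0, 3.0, 4.0]",
        "[250.5, 20.0, 40.0, 12.0, 15.0, 30.0, 9.0]",
    ]}
)
wide = expand_nutrition(toy)
print(wide[["calories", "sugar_pdv", "carbohydrates_pdv"]])
```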

These are the features we are interested in investigating. By understanding how preparation complexity and nutritional makeup jointly relate to ratings, we aim to build a recommender that balances taste appeal with health considerations.