Stardew Valley Recipe Finder: An Interactive Cooking Assistant Built with R and Shiny

Sta 523 - Final Project

Authors

Wenjie Gong

Cecilia Liu

Simeng Wu

Carol Zhou

Franklin Zhou

Introduction

Cooking is a central gameplay mechanic in Stardew Valley, offering players a wide variety of dishes that provide strategic buffs, restore health and energy, and support decision-making across different in-game activities. However, detailed recipe information is often scattered or partially hidden behind gameplay progression, making it difficult for players to search efficiently or analyze patterns across the full recipe set. Players commonly wish to identify recipes based on available ingredients, evaluate buff efficiency, or optimize crafting decisions, yet existing community resources are typically static lists with limited interactivity.

To address these limitations, our project constructs a Stardew Kitchen Recipe Explorer that automatically collects, structures, and analyzes cooking-related data from the official Stardew Valley Wiki. Using automated web scraping, we build a clean dataset of recipes, ingredients, buffs, and metadata. We further analyze gameplay patterns such as ingredient frequency, co-occurrence, and buff characteristics. These results are presented through an interactive Shiny application that allows users to search recipes by ingredient, identify dishes they can already prepare, explore recipes requiring only one or two additional items, and examine overall ingredient coverage and usage patterns across all recipes.

The plan for this project focused on automated data collection and interactive exploration, and the completed work includes a full English-language scraping pipeline alongside data cleaning, structuring, and application development. During implementation, we observed substantial variability in the wiki’s nested HTML structures, highlighting the need for careful parsing and data validation during scraping.

Finally, we briefly situate the project within its technical background. Web scraping provides a systematic method for extracting structured information from complex online documents, but wiki pages often contain inconsistent formatting that requires robust parsing techniques. Shiny, as an R-based framework for interactive visualization and user-driven exploration, is well suited for building tools that allow players to dynamically filter, inspect, and analyze recipe information.

All recipe descriptions, images, and ingredient information were sourced from the official Stardew Valley Wiki (https://stardewvalleywiki.com/Cooking).

Methods / Implementation

To build a dataset of cooking recipes in Stardew Valley, we developed a custom web scraper in R (scraping_with_images.R). The scraper systematically extracts recipe metadata, ingredient lists, buff descriptions, and associated images directly from the official Stardew Valley Wiki. Because the Cooking page also contains multiple ingredient tables (e.g., crops, tree fruit, and animal products), it is essential to retain only the main Cooking Recipes table rather than mixing all tables with a similar HTML structure into the same dataset. The scraping workflow consists of four main components:

  1. Page access and preparation

    Before scraping, the script automatically initializes a folder structure for storing data and images. Two helper functions are used in this step:

    • download_image(): downloads recipe and ingredient images with error handling and logging.
    • clean_filename(): converts recipe and ingredient names into filesystem-safe filenames for local storage.
  2. Selecting only recipe rows

    The Cooking page contains several sortable tables (<table class="wikitable sortable">). To avoid scraping unrelated tables, we first apply a structural filter:

    • extract all rows from all sortable wikitables;
    • retain only rows with at least 9 cells, which matches the column structure of the Cooking Recipes table.

    This step narrows the candidate rows but may still include a small number of non-recipe rows. Therefore, in a second filtering step we keep only those rows for which ingredient lists are successfully parsed, yielding the correct total of 81 recipes.

  3. Parsing recipe metadata

    For each filtered recipe row, we extract: name, description, image URL, energy and health values, buff effects and duration, recipe source, and selling price. These fields are obtained using fixed column positions (td:nth-child(n)), which are stable within the Cooking Recipes table.

  4. Downloading recipe and ingredient images

    For every recipe and every unique ingredient, the corresponding wiki page is accessed, an appropriate image selector is applied, and the image is downloaded into organized directories. Filenames are normalized with clean_filename() to avoid filesystem issues.
    Finally, all results are saved both as .rds objects and .csv files for downstream analysis and reproducibility.
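The row-selection and parsing steps above can be sketched as follows. This is a minimal illustration under stated assumptions, not the code in scraping_with_images.R: the inline HTML is a made-up stand-in for the wiki page, the column positions are guesses, the ingredient check is a simplified version of the second filtering step, and clean_filename_sketch() is a hypothetical stand-in for clean_filename().

```r
library(rvest)

# Made-up stand-in for the Cooking page: a 9-column recipe table plus a
# narrower ingredient table that shares the same CSS classes.
page <- minimal_html('
  <table class="wikitable sortable">
    <tr><th>Image</th><th>Name</th><th>Description</th><th>Ingredients</th>
        <th>Energy</th><th>Health</th><th>Buffs</th><th>Source</th><th>Price</th></tr>
    <tr><td><img src="/img/Fried_Egg.png"></td><td>Fried Egg</td>
        <td>Sunny-side up.</td><td>Egg (1)</td><td>50</td><td>22</td>
        <td>N/A</td><td>Starter</td><td>35g</td></tr>
  </table>
  <table class="wikitable sortable">
    <tr><th>Crop</th><th>Season</th></tr>
    <tr><td>Tomato</td><td>Summer</td></tr>
  </table>')

# Structural filter: keep rows with at least 9 <td> cells.
rows <- html_elements(page, "table.wikitable.sortable tr")
n_cells <- vapply(rows, function(r) length(html_elements(r, "td")), integer(1))
recipe_rows <- rows[n_cells >= 9]

# Read the text of the n-th cell of every retained row (fixed column position).
cell_text <- function(rows, n) {
  html_text2(html_element(rows, sprintf("td:nth-child(%d)", n)))
}

recipes <- data.frame(
  name        = cell_text(recipe_rows, 2),
  ingredients = cell_text(recipe_rows, 4),
  energy      = as.numeric(cell_text(recipe_rows, 5)),
  health      = as.numeric(cell_text(recipe_rows, 6)),
  image_url   = html_attr(html_element(recipe_rows, "td:nth-child(1) img"), "src"),
  stringsAsFactors = FALSE
)

# Simplified second filter: keep rows whose ingredient cell looks like "Egg (1)".
recipes <- recipes[grepl("\\([0-9]+\\)", recipes$ingredients), ]

# Hypothetical filename cleaner: lowercase, non-alphanumerics become "_".
clean_filename_sketch <- function(x) {
  x <- gsub("[^a-z0-9]+", "_", tolower(trimws(x)))
  gsub("^_+|_+$", "", x)
}
recipes$image_file <- paste0(clean_filename_sketch(recipes$name), ".png")
```

Header rows contain only th cells, so the cell-count filter drops them along with the narrower ingredient table.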

The Cooking page also contains 10 ingredient tables, which the application uses to let users explore the ingredients that appear in recipes. To scrape these tables, we wrote a second script, crops_info_scraper.R, which automates the extraction of all ingredient tables from the Stardew Valley Wiki’s Cooking page. It begins by creating local folders to store the output, then reads the web page and locates every relevant table using targeted CSS selectors. For each table, it identifies the nearest category heading, standardizes column names, and converts all rows into tidy data frames. Each category’s data is saved individually in both .csv and .rds formats, while a combined master list of all scraped tables is stored for easy access later. This setup produces a reproducible, organized dataset of every ingredient table found on the page.
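A minimal sketch of this table-extraction pattern is shown below. It is an illustration rather than the contents of crops_info_scraper.R: the inline HTML, the use of h3 headings, the standardize_names() helper, and the output file names are all assumptions made for the example.

```r
library(rvest)

# Made-up page fragment: two category headings, each followed by a table.
page <- minimal_html('
  <h3>Crops</h3>
  <table class="wikitable sortable">
    <tr><th>Crop Name</th><th>Season(s)</th></tr>
    <tr><td>Tomato</td><td>Summer</td></tr>
    <tr><td>Beet</td><td>Fall</td></tr>
  </table>
  <h3>Animal Products</h3>
  <table class="wikitable sortable">
    <tr><th>Item</th><th>Source</th></tr>
    <tr><td>Egg</td><td>Chicken</td></tr>
  </table>')

tables <- html_elements(page, "table.wikitable")

# Nearest category heading: the closest <h3> that precedes each table.
headings <- vapply(tables, function(tbl) {
  html_text2(html_element(tbl, xpath = "preceding::h3[1]"))
}, character(1))

# Hypothetical name cleaner: lowercase snake_case column names.
standardize_names <- function(x) {
  x <- gsub("[^a-z0-9]+", "_", tolower(x))
  gsub("^_+|_+$", "", x)
}

# Convert every table into a data frame with standardized column names.
ingredient_tables <- lapply(tables, function(tbl) {
  df <- as.data.frame(html_table(tbl))
  names(df) <- standardize_names(names(df))
  df
})
names(ingredient_tables) <- headings

# Save each category as .csv and .rds, plus one combined list.
out_dir <- file.path(tempdir(), "ingredient_tables")
dir.create(out_dir, showWarnings = FALSE, recursive = TRUE)
for (nm in names(ingredient_tables)) {
  stem <- file.path(out_dir, standardize_names(nm))
  write.csv(ingredient_tables[[nm]], paste0(stem, ".csv"), row.names = FALSE)
  saveRDS(ingredient_tables[[nm]], paste0(stem, ".rds"))
}
saveRDS(ingredient_tables, file.path(out_dir, "all_ingredient_tables.rds"))
```

The sketch writes to a temporary directory so that it can be run without touching the project's data folder.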

The next major component of the project is the construction of the Shiny application. The workflow consists of the following steps:

  1. Loading Data and Preprocessing

    The application begins by loading all .rds files generated during the scraping workflow, including recipe metadata, ingredient lists, ingredient images, and several category-specific ingredient tables (e.g., crops, fish, forage items, and animal products). From these datasets, the application constructs a sorted list of unique ingredients and extracts buff types from the recipe descriptions.

    To support informative table rendering, ingredient rows are joined with image metadata and converted into HTML snippets that combine an ingredient icon with its name and required quantity. The category-specific ingredient tables are then harmonized into a unified lookup structure using map_dfr(), producing a comprehensive ingredient-to-category mapping that powers search and filtering in the Ingredient Explorer tab. These enriched ingredient entries are also made clickable within the recipe tables, enabling seamless navigation to their corresponding detail views in the Ingredient Explorer.
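The HTML-snippet step might look roughly like the sketch below. The helper name ingredient_html(), the image paths, and the markup are hypothetical. Only the input name switch_to_explorer comes from the application, and Shiny.setInputValue() is Shiny's standard JavaScript function for sending a value to the server.

```r
# Hypothetical helper: build one clickable ingredient entry (icon, name, quantity).
ingredient_html <- function(name, quantity, image_path) {
  # This sketch does no escaping, so it refuses names with special characters.
  stopifnot(!grepl("['\"<>&]", name))
  sprintf(
    paste0(
      "<a href=\"#\" onclick=\"Shiny.setInputValue('switch_to_explorer', '%s', ",
      "{priority: 'event'}); return false;\">",
      "<img src=\"%s\" height=\"24\" alt=\"%s\"> %s (%d)</a>"
    ),
    name, image_path, name, name, as.integer(quantity)
  )
}

# Toy long-format ingredient rows (illustrative values and paths).
ingredients <- data.frame(
  recipe     = c("Omelet", "Omelet"),
  ingredient = c("Egg", "Milk"),
  quantity   = c(1, 1),
  image      = c("images/egg.png", "images/milk.png"),
  stringsAsFactors = FALSE
)
ingredients$html <- ingredient_html(
  ingredients$ingredient, ingredients$quantity, ingredients$image
)

# Collapse the entries into one HTML cell per recipe.
cell <- tapply(ingredients$html, ingredients$recipe, paste, collapse = "<br>")
```

A table showing such cells has to be created with escaping turned off (for example DT::datatable(..., escape = FALSE)), otherwise the markup is displayed as literal text.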

  2. Helper Functions

    Because the Stardew Valley Wiki includes recipes requiring “Any Fish”, the application must determine whether the player’s ingredient selection satisfies this wildcard condition. A predefined list, fish_list, is therefore used to detect whether the selection contains any valid fish item.

    Two primary helper functions support the recipe-filtering logic:

    • find_makeable_recipes() identifies recipes that can be cooked immediately. It checks whether all required ingredients are present in the user’s selection, incorporates special handling for “Any Fish,” and applies additional filters such as buff type, minimum energy, and minimum health.
    • find_almost_recipes() identifies recipes for which the user is missing only one or two ingredients. It computes missing items, attaches ingredient images and quantities, and formats these items into clickable HTML elements that link directly to the Ingredient Explorer.

    Together, these functions perform the core recipe-matching logic underlying all interactive recipe tables.
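The matching logic can be illustrated with a small base-R sketch. The functions below are simplified stand-ins, not the application's find_makeable_recipes() and find_almost_recipes(): they ignore quantities and buff filters, the data frames are toy examples, and fish_list here is a short illustrative subset.

```r
# Toy long-format data: one row per recipe-ingredient pair.
recipe_ingredients <- data.frame(
  recipe = c("Fried Egg", "Omelet", "Omelet", "Sashimi",
             "Baked Fish", "Baked Fish", "Baked Fish"),
  ingredient = c("Egg", "Egg", "Milk", "Any Fish",
                 "Sunfish", "Bream", "Wheat Flour"),
  stringsAsFactors = FALSE
)
recipes <- data.frame(
  recipe = c("Fried Egg", "Omelet", "Sashimi", "Baked Fish"),
  energy = c(50, 100, 75, 75),
  stringsAsFactors = FALSE
)
fish_list <- c("Sunfish", "Bream", "Tuna")

# Number of unsatisfied ingredients per recipe; "Any Fish" counts as
# satisfied when the selection contains at least one item from `fish`.
count_missing <- function(selected, ri, fish) {
  have_fish <- any(selected %in% fish)
  satisfied <- ri$ingredient %in% selected |
    (ri$ingredient == "Any Fish" & have_fish)
  vapply(split(!satisfied, ri$recipe), sum, integer(1))
}

# Recipes with nothing missing, optionally filtered by minimum energy.
makeable_sketch <- function(selected, ri, recipes, fish, min_energy = 0) {
  n <- count_missing(selected, ri, fish)
  ok <- names(n)[n == 0]
  recipes$recipe[recipes$recipe %in% ok & recipes$energy >= min_energy]
}

# Recipes missing between 1 and `max_missing` ingredients.
almost_sketch <- function(selected, ri, fish, max_missing = 2) {
  n <- count_missing(selected, ri, fish)
  n[n >= 1 & n <= max_missing]
}

selected <- c("Egg", "Sunfish")
can_make <- makeable_sketch(selected, recipe_ingredients, recipes, fish_list)
almost   <- almost_sketch(selected, recipe_ingredients, fish_list)
```

With Egg and Sunfish selected, Fried Egg and Sashimi can be made, while Omelet is missing one ingredient and Baked Fish is missing two.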

  3. User Interface Structure

    The UI is built using Bootstrap styling through the bslib package and is organized into a sidebar panel and a main panel containing five tabbed sections. The sidebar offers ingredient selection, quick-add ingredient groups, buff and nutrition filters, and reset functionality, accompanied by visual cues such as ingredient counters.

    The main panel consists of five tabbed sections:

    • Can Make Now: Displays recipes that can be prepared immediately based on the selected ingredients
    • Almost There: Shows recipes that require only a few additional ingredients
    • All Recipes: Presents the complete recipe catalog
    • Statistics: Provides visual summaries and analytic insights into the recipe dataset
    • Ingredient Explorer: A searchable, categorized catalog of all ingredients, complete with image thumbnails and structured metadata. Users can navigate to this tab directly by clicking ingredients in any recipe table. This tab allows users to explore the structure of the game’s ingredient system beyond the recipe context, offering a broader view of how items relate to categories such as crops and animal products.
  4. Server Logic and Reactive Programming

    The server function manages all dynamic behavior in the application using Shiny’s reactive programming framework. A key reactive flag, show_all_recipes, determines whether the recipe tables should ignore ingredient selection and display the full recipe catalogue. This value is updated by various UI actions and governs the filtering logic throughout the application.

    Several observeEvent handlers implement the quick-add ingredient buttons (add_staples, add_dairy, add_veggies, and add_all). The first three append predefined ingredient sets to the current selection while ensuring that show_all_recipes is reset to FALSE. The “Show All Recipes” button instead sets show_all_recipes to TRUE and clears the ingredient list, allowing users to browse the entire dataset with optional filters.

    An additional observer monitors the energy, health, and buff filters. When users attempt to apply these filters without selecting ingredients (and without activating the “Show All Recipes” mode), the server issues a contextual notification prompting them to select ingredients before filtering.

    A custom reactive input, switch_to_explorer, integrates the recipe tables with the Ingredient Explorer view. Each ingredient in the recipe tables is rendered as a clickable HTML element that triggers a JavaScript callback. When clicked, the callback sends the ingredient name to the server, which responds by switching the active tab to Ingredient Explorer and pre-populating the ingredient search box. This mechanism provides smooth cross-view navigation and enables users to drill down from a recipe to detailed ingredient information.

    In addition, the Ingredient Explorer is implemented as a category-aware interface. Instead of rendering a single unified ingredient table, the server iterates over all category-specific ingredient tables contained in ingredient_tables (e.g., Crops, Foraged Goods, Animal Products), generating a separate DataTable for each category. This behavior is achieved through a looping structure that dynamically constructs output IDs (such as table_Crops) and assigns a corresponding renderDT() expression to each one. As a result, the application automatically produces a structured multi-table layout in which each category appears under its own heading. This design allows the Ingredient Explorer to group ingredients visually and functionally by category and ensures that the tab responds reactively to user selections.
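The per-category table loop can be sketched as below. This is an assumed reconstruction of the pattern, not the application's server code: the toy ingredient_tables list and the ID sanitization rule are made up for the example, and the application's own looping construct may differ.

```r
library(shiny)
library(DT)

# Toy stand-in for the scraped category tables (illustrative values).
ingredient_tables <- list(
  "Crops" = data.frame(name = c("Tomato", "Beet"), season = c("Summer", "Fall")),
  "Animal Products" = data.frame(name = c("Egg", "Milk"), source = c("Chicken", "Cow"))
)

# One output ID per category, e.g. "table_Crops". Replacing other characters
# with "_" is an assumption made here to keep the IDs simple.
table_ids <- paste0("table_", gsub("[^A-Za-z0-9]+", "_", names(ingredient_tables)))

# UI: a heading and a table placeholder for every category.
ui <- fluidPage(
  lapply(seq_along(table_ids), function(i) {
    tagList(h4(names(ingredient_tables)[i]), DTOutput(table_ids[i]))
  })
)

server <- function(input, output, session) {
  # lapply() gives each iteration its own `i`, so every renderDT() expression
  # still refers to its own category when it is evaluated later.
  lapply(seq_along(table_ids), function(i) {
    output[[table_ids[i]]] <- renderDT(
      datatable(ingredient_tables[[i]], rownames = FALSE)
    )
  })
}

# To try it interactively: shinyApp(ui, server)
```

Using lapply() with a function rather than a plain for loop is a deliberate choice in this sketch: render expressions are evaluated lazily, and a shared loop variable would leave every table pointing at the last category.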

  5. Interactive Data Table Rendering

    Four interactive tables use DataTables for enhanced presentation:

    • “Can Make Now” Table checks application state, calls find_makeable_recipes(), displays guidance when no ingredients are selected, and embeds recipe images and formatted ingredient lists.
    • “Almost There” Table calls find_almost_recipes() to find recipes missing 1–2 ingredients, retrieves the required quantities for the missing ingredients, and formats them with icons, names, and quantities.
    • “All Recipes” Table displays the full recipe catalog and maintains visual consistency with the other tables.
    • “Ingredient Explorer” Table presents ingredient-level information by category, attaching image thumbnails and cleaning auxiliary columns so that only informative fields (e.g., name, category, uses) are shown. This table is also the target view for ingredient clicks originating from the recipe tables.
  6. Visualizations

    The Ingredient Frequency Plot displays the 15 most commonly used ingredients across all recipes using a horizontal bar chart. This reveals which ingredients are most essential in gameplay.

    The Recipe Coverage Plot compares the user’s ability to cook recipes based on their current ingredient selection. Recipes are categorized into Can Make Now, Close (1–2 missing), and Need More Ingredients, with intuitive color coding (green/yellow/red). Together, these visualizations provide analytical insights into recipe patterns and player readiness.
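The quantities behind the two plots can be computed along the following lines. This is a sketch on toy data, not the application's plotting code: the data frame, the selection, and the use of cut() for the three coverage groups are assumptions, and the ggplot2 part only builds a plot object when that package is available.

```r
# Toy long-format data: one row per recipe-ingredient pair.
recipe_ingredients <- data.frame(
  recipe = c("Fried Egg", "Omelet", "Omelet", "Pancakes", "Pancakes",
             "Salad", "Salad", "Salad"),
  ingredient = c("Egg", "Egg", "Milk", "Wheat Flour", "Egg",
                 "Leek", "Dandelion", "Vinegar"),
  stringsAsFactors = FALSE
)

# Ingredient frequency: number of recipes using each ingredient, top 15.
freq <- sort(table(recipe_ingredients$ingredient), decreasing = TRUE)
top <- head(freq, 15)
top_df <- data.frame(ingredient = names(top), n_recipes = as.integer(top))

# Recipe coverage for a given selection: count missing ingredients per
# recipe, then bin the counts into the three groups used by the plot.
selected <- c("Egg", "Milk")
is_missing <- !(recipe_ingredients$ingredient %in% selected)
n_missing <- vapply(split(is_missing, recipe_ingredients$recipe), sum, integer(1))
status <- cut(
  n_missing,
  breaks = c(-Inf, 0, 2, Inf),
  labels = c("Can Make Now", "Close (1-2 missing)", "Need More Ingredients")
)
coverage <- table(status)

# Optional: horizontal bar chart of the top ingredients.
if (requireNamespace("ggplot2", quietly = TRUE)) {
  freq_plot <- ggplot2::ggplot(
    top_df,
    ggplot2::aes(x = reorder(ingredient, n_recipes), y = n_recipes)
  ) +
    ggplot2::geom_col() +
    ggplot2::coord_flip() +
    ggplot2::labs(x = NULL, y = "Number of recipes")
}
```

In this toy example Egg is the most frequent ingredient, and with Egg and Milk selected two recipes fall into Can Make Now, one into Close, and one into Need More Ingredients.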

Results

Our web-scraping pipeline extracted a complete set of 81 cooking recipes from the English Stardew Valley Wiki, along with their associated metadata. The cleaned recipes dataset contains one row per recipe and includes standardized fields such as the recipe name, in-game description, nutritional values (energy and health), buff effects, and image URLs. Table 1 presents the first five rows of this dataset to illustrate its structure.

A second dataset, recipe_ingredients, stores the ingredient composition of each dish in a long-format representation, where each row corresponds to a single recipe–ingredient pair and includes the required quantity. Table 2 provides the first five rows as an example of the cleaned structure.

In addition to recipe-level attributes, we also collected detailed ingredient-level metadata to support richer analysis and app functionality. The all_ingredient_tables object contains structured information for various ingredient categories, including crops. These metadata fields describe properties such as harvest season, growth time, processing requirements, and the specific recipes in which each ingredient is used. Table 3 presents the first five rows of the Crops subtable within all_ingredient_tables. Some metadata fields are missing because the source HTML does not provide certain details (e.g., missing notes), rather than due to extraction errors. Together, these three tables summarize the core outputs of the data acquisition and cleaning pipeline.


The main goal of this Shiny application is to demonstrate how interactive data applications can turn raw wiki data into an accessible decision tool for players. The resulting application has the following tabs: “Can Make Now”, “Almost There”, “All Recipes”, “Statistics”, and “Ingredient Explorer”.

The application displays tables in a consistent way. In the “Can Make Now”, “Almost There”, and “All Recipes” tabs, the table consists of the columns “Recipe”, “Ingredients”, “Energy”, “Health”, “Buff”, “Duration”, and “Sell Price”, accompanied by a search bar that lets users search for specific keywords.

The “Can Make Now” and “Almost There” tabs are the application’s central decision tools. Through dynamic filtering, users can quickly identify recipes they can cook with their current ingredients in the “Can Make Now” tab, or see what is missing to complete a dish in the “Almost There” tab (Fig. 1 and Fig. 2). The design balances simplicity and engagement: clickable ingredient links connect recipe results with a separate “Ingredient Explorer” tab (Fig. 3), while built-in plots provide visual insight into ingredient frequency and overall recipe coverage (Fig. 4). The two tabs support real-time decisions by answering questions such as “What can I make right now with tomato and beet?” or “What am I close to making with apple and sugar on hand, and what else do I need?”

The sidebar offers another way of filtering recipes. After clicking “Show All Recipes”, the user can adjust the “Min Energy” and “Min Health” sliders in the “Can Make Now” tab to find recipes that meet their nutritional needs. Filtering by a desired buff is also possible through the sidebar. For example, clicking “Show All Recipes” displays all recipes (Fig. 6); setting “Min Energy” to 180 then leaves only high-energy recipes such as “Complete Breakfast” (Fig. 7).

As a general reference catalog, we designed an “All Recipes” tab (Fig. 5). It is a field guide collecting all recipes in the dataset, and its structure resembles that of the “Can Make Now” and “Almost There” tabs. It is functionally distinct from “Can Make Now” and “Almost There”, whose contents depend on the user’s ingredient inventory, and from “Ingredient Explorer”, which depends on the dynamic filter conditions. The content of this tab is fixed: it displays the complete scraped dataset, which allows users to appreciate the full scope of recipes in Stardew Valley.

Rather than focusing on the recipes, the “Ingredient Explorer” tab integrates the different kinds of information about ingredients into one searchable interface. It displays information about the ingredients in separate tables by category. This gives users a streamlined way to move from a recipe they want to make to an understanding of its components and where they come from.

Figure 1. The sidebar allows the user to select multiple ingredients. The main panel shows the recipes that can be made with the selected ingredients, along with related information about each dish.

Figure 2. The sidebar enables users to refine recipe searches by adjusting sliders for minimum energy and health values, as well as by applying buff filters. The Almost There tab highlights recipes that are only one or two ingredients short of completion, while interactive ingredient links connect directly to the Ingredient Explorer tab, providing detailed information about each ingredient.

Figure 3. The Ingredient Explorer tab displays the selected ingredients in separate tables organized by category, improving readability and navigation. Users can access this tab directly by clicking on ingredient hyperlinks within the Can Make Now, Almost There, or All Recipes tables, allowing for exploration of ingredient details and relationships.

Figure 4. The Statistics tab provides a quantitative overview of recipe coverage and ingredient usage frequency based on the user’s current selections. This visualization helps with resource planning by highlighting which ingredients contribute most to available recipes and identifying potential gaps for efficient inventory management.

Figure 5. The All Recipes tab provides a stable, comprehensive display of the full recipe dataset. This tab is tailored to users who want to browse a full catalog of recipes. It is not intended to assist with real-time decision-making, but to let users who are interested in learning more about the details of recipes browse freely.

Figure 6. The Can Make Now tab after clicking on “Show All Recipes”. All recipes in the dataset appear under the tab.

Figure 7. The Can Make Now tab after setting the “Min Energy” slider to 180. Filtering in this way allows the user to search for recipes that supply a considerable amount of energy, in line with their nutritional needs. Notice that the first row changes from “Fried Egg” to “Complete Breakfast”, because only high-energy recipes supply the required amount of energy.

Discussion

Overall, the project successfully achieved its primary objectives. We developed a reproducible and reliable web-scraping pipeline capable of extracting structured recipe information from the Stardew Valley Wiki, and the resulting dataset integrated smoothly with the Shiny application. The application performed efficiently on the English dataset, offering responsive ingredient-based filtering, recipe exploration, and visual summaries that provided meaningful insights into gameplay patterns such as ingredient frequency and usage distribution.

Several technical challenges emerged during the implementation. Extracting ingredient frequencies and identifying recipe-level optimization opportunities required robust parsing logic due to the wiki’s inconsistent HTML formatting and mixed text-image cells. Additionally, accurately linking ingredient images to their textual representations demanded extensive cleaning and matching procedures to ensure consistency across the dataset. These challenges highlighted limitations inherent to working with community-maintained webpages, where structural variability introduces uncertainty into automated pipelines.

The project provides a valuable contribution by consolidating scattered recipe information into a structured, interactive platform. The analysis components, such as ingredient frequency and recipe coverage, offer players data-driven perspectives on gameplay strategy, while the modular data structure establishes a foundation for future research on cross-recipe patterns and gameplay optimization.

Looking ahead, future work will focus on expanding both the accessibility and analytical sophistication of the platform. One direction could be the implementation of multilingual support, including cross-language data scraping and the development of a standardized translation dictionary, to better accommodate the global Stardew Valley player community. Although this direction was outlined in the original project proposal, only the English dataset was fully implemented in the current phase due to time constraints and the substantial complexity associated with cross-language data integration. Methodologically, advancing this component will require designing language-specific parsing rules and developing robust alignment strategies capable of reconciling structural and semantic differences across wikis.

In addition to multilingual functionality, the platform could be strengthened by incorporating more advanced analytical modules, such as quantitative buff-efficiency metrics and seasonal ingredient availability modeling. These extensions would deepen the system’s capacity for strategic evaluation, offering users more nuanced insights into recipe optimization and resource planning.

Reproducibility Notes

Step 1: Make sure you have all the required packages installed: bslib, dplyr, DT, ggplot2, purrr, rvest, shiny, and stringr.

Step 2: Run scraping_with_images.R if the www and data folders are missing.

Step 3: Run crops_info_scraper.R if data/ingredient_tables is missing.

Step 4: Run our Shiny Application (shiny_app_final.R) and play around with it!

Note: The results are reproducible as long as the wiki’s page structure remains unchanged.
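The steps above could be scripted roughly as follows. This is a hedged sketch: the helper names missing_packages() and scripts_to_run() are hypothetical, the folder checks assume the scripts write to www, data, and data/ingredient_tables relative to the project root, and the final calls are left as comments because they access the network or start the app.

```r
required_pkgs <- c("bslib", "dplyr", "DT", "ggplot2", "purrr",
                   "rvest", "shiny", "stringr")

# Step 1: report which required packages are not installed.
missing_packages <- function(pkgs = required_pkgs) {
  installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
  pkgs[!installed]
}

# Steps 2 and 3: decide which scraping scripts need to be run.
scripts_to_run <- function(root = ".") {
  scripts <- character(0)
  if (!dir.exists(file.path(root, "www")) || !dir.exists(file.path(root, "data"))) {
    scripts <- c(scripts, "scraping_with_images.R")
  }
  if (!dir.exists(file.path(root, "data", "ingredient_tables"))) {
    scripts <- c(scripts, "crops_info_scraper.R")
  }
  scripts
}

# Typical use from the project root (left commented out on purpose):
# install.packages(missing_packages())
# for (s in scripts_to_run()) source(s)
# shiny::runApp("shiny_app_final.R")  # Step 4
```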