| Title: | Interactive 'shiny' GUI for the 'earth' Package |
|---|---|
| Description: | Provides a 'shiny'-based graphical user interface for the 'earth' package, enabling interactive building and exploration of Multivariate Adaptive Regression Splines (MARS) models. Features include data import from CSV and 'Excel' files, automatic detection of categorical variables, interactive control of interaction terms via an allowed matrix, comprehensive model diagnostics with variable importance and partial dependence plots, and publication-quality report generation via 'Quarto'. |
| Authors: | William Craytor [aut, cre] |
| Maintainer: | William Craytor <[email protected]> |
| License: | AGPL (>= 3) |
| Version: | 0.8.0 |
| Built: | 2026-06-02 10:17:58 UTC |
| Source: | https://github.com/wcraytor/earthui |
Saves the result via saveRDS() to
<file_name>_earthUI_result_<timestamp>.rds and verifies the file is
readable and can produce a prediction. If verification fails, the file
is deleted. Skipped silently for models with degree > 2 (mgcvUI only
supports pairwise interactions).
auto_export_for_mgcv(result, output_folder, file_name)
result |
A fit result list as returned by |
output_folder |
Character scalar. May be |
file_name |
Character scalar. Used to derive the output filename. |
Invisibly, NULL.
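The save-then-verify behaviour described for this function can be sketched in base R. This is only an illustration, not the package's implementation: `save_verified_sketch()` is a hypothetical helper, an `lm` fit stands in for the earthUI result, and the sketch returns a success flag rather than NULL.

```r
# Hypothetical helper: save an object, then confirm the file can be read back
# and used for a prediction; remove the file if that check fails.
save_verified_sketch <- function(obj, path, newdata) {
  saveRDS(obj, path)
  ok <- tryCatch({
    restored <- readRDS(path)
    pred <- predict(restored, newdata = newdata)
    is.numeric(pred) && length(pred) == nrow(newdata)
  }, error = function(e) FALSE)
  if (!ok) unlink(path)   # drop files that fail verification
  invisible(ok)
}

fit <- lm(mpg ~ wt, data = mtcars)   # stand-in model, not an earthUI result
out <- file.path(tempdir(), "demo_result.rds")
save_verified_sketch(fit, out, mtcars[1:3, ])
file.exists(out)
```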
Converts an allowed interaction matrix into a function compatible with
the allowed parameter of earth::earth(). The function checks that
ALL pairwise combinations among the predictors in a proposed interaction
term are TRUE in the matrix.
build_allowed_function(allowed_matrix, block_degree1 = NULL)
allowed_matrix |
A symmetric logical matrix as returned by
|
block_degree1 |
Optional character vector of predictor names to
block from entering the model as degree-1 (main effect) terms. These
variables can still participate in interactions (degree >= 2). This is
useful when a variable like |
The returned function implements the standard earth() allowed function
contract. When earth proposes a new hinge function involving predictor
pred with existing parent predictors indicated by the parents logical
vector, the function checks that every pair of involved predictors is
allowed in the matrix.
For a 3-way interaction between X, Y, Z, the function verifies that (X,Y), (Y,Z), and (X,Z) are all TRUE in the matrix.
When block_degree1 is specified, any predictor in that list is blocked
from entering as a degree-1 term but is allowed in higher-degree
interactions (subject to the allowed matrix).
A function with signature
function(degree, pred, parents, namesx, first) suitable for the
allowed parameter of earth::earth().
mat <- build_allowed_matrix(c("sqft", "bedrooms", "pool"))
mat["sqft", "pool"] <- FALSE
mat["pool", "sqft"] <- FALSE
func <- build_allowed_function(mat)
# Block sale_age from degree 1 (interaction only)
mat2 <- build_allowed_matrix(c("sale_age", "living_area", "lot_size"))
func2 <- build_allowed_function(mat2, block_degree1 = "sale_age")
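The all-pairs check can be illustrated with a small stand-alone closure. This is a sketch under assumptions, not the package source: `make_allowed_sketch()` is a hypothetical name, `parents` is treated as nonzero/TRUE for predictors already in the parent term, and the entries of `namesx` are assumed to match the matrix's row and column names exactly (factor-expanded column names are not handled).

```r
# Hypothetical re-implementation for illustration only.
make_allowed_sketch <- function(allowed_matrix, block_degree1 = NULL) {
  function(degree, pred, parents, namesx, first) {
    candidate <- namesx[pred]
    # Listed predictors may not enter as main effects.
    if (degree == 1L && candidate %in% block_degree1) return(FALSE)
    involved <- unique(c(namesx[as.logical(parents)], candidate))
    if (length(involved) < 2L) return(TRUE)
    # Every pair among the involved predictors must be allowed.
    pairs <- utils::combn(involved, 2L)
    all(allowed_matrix[cbind(pairs[1, ], pairs[2, ])])
  }
}

vars <- c("x", "y", "z")
m <- matrix(TRUE, 3, 3, dimnames = list(vars, vars))
m["x", "z"] <- m["z", "x"] <- FALSE
allowed <- make_allowed_sketch(m, block_degree1 = "y")
allowed(3L, 3L, c(1, 1, 0), vars, TRUE)   # proposes x:y:z, which contains (x, z)
```

Because the closure captures the matrix, the same matrix can be edited and a fresh function rebuilt without touching the fitting call.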
Creates a symmetric logical matrix indicating which pairs of predictors are allowed to interact. By default, all interactions are allowed.
build_allowed_matrix(variable_names, default = TRUE)
variable_names |
Character vector of predictor variable names. |
default |
Logical. Default value for all entries. Default is |
A symmetric logical matrix with variable_names as both row and
column names.
mat <- build_allowed_matrix(c("sqft", "bedrooms", "pool"))
mat["sqft", "pool"] <- FALSE
mat["pool", "sqft"] <- FALSE
mat
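Because the matrix must stay symmetric, each change has to be applied to both (a, b) and (b, a), as the example does by hand. A minimal base-R sketch of that idea follows; `allowed_matrix_sketch()` and `set_pair_sketch()` are hypothetical helpers and are not part of the package.

```r
# Hypothetical stand-in for the matrix constructor described above.
allowed_matrix_sketch <- function(variable_names, default = TRUE) {
  n <- length(variable_names)
  matrix(default, nrow = n, ncol = n,
         dimnames = list(variable_names, variable_names))
}

# Hypothetical convenience: update both triangles so symmetry is preserved.
set_pair_sketch <- function(mat, a, b, value) {
  mat[a, b] <- value
  mat[b, a] <- value
  mat
}

m <- allowed_matrix_sketch(c("sqft", "bedrooms", "pool"))
m <- set_pair_sketch(m, "sqft", "pool", FALSE)
identical(m, t(m))   # symmetry check
```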
Generates a multi-sheet xlsx workbook formatted as a Sales Comparison Grid from an RCA-adjusted data frame. Each sheet shows the subject and up to three comparable sales side by side, with factual values, value contributions, and adjustments per regression variable, plus rows for grouped variables (location, site, age), residual feature inputs, and an Adjusted Sale Price formula.
build_sales_grid(
  rca_df,
  comp_rows,
  output_file,
  specials = list(),
  title_prefix = "Intermediate Sales Comparable Grid",
  progress_fn = NULL
)
rca_df |
A data frame produced by the RCA workflow. Row 1 is the
subject; rows 2+ are comps. Must contain columns produced by
|
comp_rows |
Integer vector of row numbers (>= 2) to include in the grid. Maximum 30 (10 sheets, 3 comps per sheet). |
output_file |
Character scalar. Destination xlsx path. |
specials |
Named list mapping a special type
(e.g. |
title_prefix |
Character scalar. Sheet-title prefix. Defaults to
|
progress_fn |
Optional function called after each sheet is written
with arguments |
This is the non-Shiny computation kernel used by the earthUI Shiny app's
Sales Grid download button, and is also suitable for use from batch
scripts that already have rca_df in memory.
Invisibly, the output_file path.
Derives a short code from a city name by lowercasing it, stripping all
non-alphanumeric characters, and keeping the first 6 characters. If the
resulting code already exists in existing_codes, a suffix _1, _2, ... is
appended until the code is unique.
city_abbreviation(full_name, existing_codes = character(0))
full_name |
Character scalar (e.g. |
existing_codes |
Character vector of codes already in the parent
folder. Defaults to |
Character scalar.
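The abbreviation rule can be sketched in a few lines of base R. `abbrev_sketch()` is a hypothetical helper written from the description above; in particular, treating only a-z and 0-9 as alphanumeric is an assumption.

```r
# Hypothetical sketch of the rule: lowercase, strip non-alphanumerics,
# keep the first 6 characters, then add _1, _2, ... until unique.
abbrev_sketch <- function(full_name, existing_codes = character(0)) {
  base <- substr(gsub("[^a-z0-9]", "", tolower(full_name)), 1L, 6L)
  candidate <- base
  i <- 0L
  while (candidate %in% existing_codes) {
    i <- i + 1L
    candidate <- paste0(base, "_", i)
  }
  candidate
}

abbrev_sketch("San Francisco")
abbrev_sketch("San Francisco", existing_codes = c("sanfra", "sanfra_1"))
```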
Returns an enhanced copy of data with per-target model columns appended:
est_<target>, residual, cqa, optional residual_sf / cqa_sf (when
a living-area column is supplied), per-g-function <var>_contribution
columns, basis, and calc_residual. For appraisal or market purposes,
rows are sorted by residual_sf (or residual) descending, with the
subject row (row 1) pinned on top when applicable. Ranking columns
(residual_sf, cqa_sf, residual, cqa) are moved to the leftmost
positions.
compute_intermediate_output(
  data,
  result = NULL,
  purpose = c("general", "appraisal", "market"),
  skip_subject_row = FALSE,
  living_area_col = NULL
)
data |
A data frame (the raw imported data). Must contain the target
column(s) named in |
result |
A fit result list as returned by |
purpose |
Character scalar: |
skip_subject_row |
Logical. In |
living_area_col |
Character scalar or |
This is the non-Shiny computation kernel used by the earthUI Shiny app's Download Intermediate Output button, and is also suitable for use from batch scripts.
A data frame.
Starting from the raw data (subject in row 1, comps in rows 2+) and a
fitted earth model, produces the adjusted comparables data frame used by
the Sales Comparison Grid. The user-supplied subject CQA score is
converted into a subject residual via linear interpolation of the comp
CQA/residual curve; per-g-function contributions and adjustments are
then added for each comp, along with net/gross adjustments, percentages,
and the final adjusted_sale_price.
compute_rca_adjustments(
  data,
  result,
  user_cqa,
  cqa_type = c("cqa", "cqa_sf"),
  living_area_col = NULL,
  weight_col = NULL
)
data |
A data frame (subject + comps) matching |
result |
A fit result list as returned by |
user_cqa |
Numeric scalar in |
cqa_type |
Character: |
living_area_col |
Character scalar or |
weight_col |
Character scalar or |
For multi-target models, the primary target drives the subject residual calculation. Additional targets receive their own residual interpolation and adjustment columns, imputed only for zero-weight rows.
This is the non-Shiny computation kernel used by the earthUI Shiny app's RCA Raw Output button, and is also suitable for use from batch scripts.
An enhanced data frame with model columns, contributions,
adjustments, subject_value, subject_cqa, and
adjusted_sale_price.
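The interpolation step, which converts a subject CQA score into a residual along the comp CQA/residual curve, can be illustrated with `stats::approx()`. The comp values below are invented for the example, and the exact handling of ties, ordering, and out-of-range scores in the package is not specified here.

```r
# Hypothetical comp data: CQA scores paired with model residuals.
comps <- data.frame(
  cqa      = c(2, 5, 8),
  residual = c(-12000, 1000, 15000)
)
user_cqa <- 6.5

# Linear interpolation along the comp curve.
subject_residual <- approx(x = comps$cqa, y = comps$residual,
                           xout = user_cqa)$y
subject_residual
```

With the defaults of `approx()`, a score outside the range of the comp scores yields NA, so a real workflow would need a rule for that case.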
Given a vector of contract (sale) dates and a single effective date, returns
the difference in integer days (effective_date - contract_date). Contract
dates may be supplied as POSIXct, Date, character strings parseable by
as.POSIXct(), or numeric Excel serial date numbers (origin 1899-12-30).
compute_sale_age(contract_vals, effective_date)
contract_vals |
A vector of contract/sale dates. Accepted types:
|
effective_date |
The effective (appraisal) date. Accepted types:
|
This function is the non-Shiny computation kernel used by the
earthUI Shiny app when computing a sale_age column from a designated
contract_date column. It is also suitable for use from batch scripts.
An integer vector the same length as contract_vals, giving the
number of whole days between effective_date and each contract date.
NA contract values propagate to NA results.
compute_sale_age(
  contract_vals = as.Date(c("2024-01-15", "2024-06-01")),
  effective_date = as.Date("2025-01-01")
)
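The accepted input types can be normalised to Date before subtracting. The sketch below uses only base R; `to_date_sketch()` and `sale_age_sketch()` are hypothetical names, and using UTC for the POSIXct and character conversions is an assumption.

```r
# Hypothetical normaliser for the input types listed above.
to_date_sketch <- function(x) {
  if (inherits(x, "Date")) return(x)
  if (inherits(x, "POSIXct")) return(as.Date(x, tz = "UTC"))
  if (is.numeric(x)) return(as.Date(x, origin = "1899-12-30"))  # Excel serial
  as.Date(as.POSIXct(x, tz = "UTC"))                            # character
}

sale_age_sketch <- function(contract_vals, effective_date) {
  # Date - Date gives a difftime in days; as.integer() drops the units.
  as.integer(to_date_sketch(effective_date) - to_date_sketch(contract_vals))
}

sale_age_sketch(c("2024-01-15", "2024-06-01"), "2025-01-01")
```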
Renders any .qmd file (not just earthUI-generated ones) to the
requested formats via quarto::quarto_render(). Useful for
converting hand-edited or manually-combined Quarto reports — e.g.,
a master document that uses {{< include >}} to pull in multiple
project reports.
convert_quarto_file(
  qmd_path,
  formats = c("html"),
  output_dir = NULL,
  paper_size = "letter"
)
qmd_path |
Path to a Quarto source ( |
formats |
Character vector of output formats. Any subset of
|
output_dir |
Directory to write the rendered output(s). Defaults
to the same directory as |
paper_size |
Character: |
Invisibly, a character vector of output file paths.
Returns a named character vector that maps country display names to
ISO 3166-1 alpha-2 codes. Suitable for a selectInput choices list.
country_choices()
Named character vector. Names are display names, values are lowercase 2-letter codes.
Returns the ordered character vector of admin level labels for the
given ISO 3166-1 alpha-2 country code. Used by the UI cascade and by
regproj_path() to validate path depth.
country_schema(cc)
cc |
Character scalar. Lowercase ISO 3166-1 alpha-2 country code
(e.g. |
Unknown country codes return a generic 2-level fallback
(c("region", "city")) so users in countries not yet covered by the
shipped table still get a sensible cascade.
Character vector of admin level labels, top to bottom.
Resolution order:
1. REGPROJ_ROOT environment variable.
2. regproj_root field in user prefs (earthui_prefs_path()).
3. Per-OS default: C:/regProj on Windows; ~/regProj elsewhere.
default_regproj_root()
Character scalar. Absolute path.
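The three-step resolution order can be sketched as a simple fall-through. `resolve_root_sketch()` is a hypothetical helper; it takes the preferences as an already-parsed list rather than reading prefs.json, and it skips any path normalisation the package may perform.

```r
# Hypothetical sketch: environment variable, then prefs, then OS default.
resolve_root_sketch <- function(prefs = list()) {
  env <- Sys.getenv("REGPROJ_ROOT", unset = "")
  if (nzchar(env)) return(env)
  pref_root <- prefs$regproj_root
  if (is.character(pref_root) && length(pref_root) == 1L && nzchar(pref_root)) {
    return(pref_root)
  }
  if (.Platform$OS.type == "windows") "C:/regProj" else path.expand("~/regProj")
}

resolve_root_sketch(list(regproj_root = "/data/regProj"))
```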
Returns a named logical vector indicating which columns are likely
categorical. Character and factor columns are always flagged. Numeric
columns with max_unique or fewer unique values are also flagged.
detect_categoricals(df, max_unique = 10L)
df |
A data frame. |
max_unique |
Integer. Numeric columns with this many or fewer unique values are flagged as likely categorical. Default is 10. |
A named logical vector with one element per column. TRUE indicates
the column is likely categorical.
df <- data.frame(
  price = c(100, 200, 300, 400),
  pool = c("Y", "N", "Y", "N"),
  bedrooms = c(2, 3, 2, 4),
  sqft = c(1200, 1500, 1300, 1800)
)
detect_categoricals(df)
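The flagging rule can be written as a short `vapply()` over the columns. `detect_cat_sketch()` is a hypothetical helper; it assumes the threshold is inclusive ("this many or fewer"), as the max_unique argument description states, and it counts NA as a distinct value, which may differ from the package.

```r
# Hypothetical sketch of the categorical-detection rule.
detect_cat_sketch <- function(df, max_unique = 10L) {
  vapply(df, function(col) {
    if (is.character(col) || is.factor(col)) return(TRUE)
    is.numeric(col) && length(unique(col)) <= max_unique
  }, logical(1))
}

demo <- data.frame(
  price = c(100, 200, 300, 400),
  pool = c("Y", "N", "Y", "N"),
  bedrooms = c(2, 3, 2, 4)
)
detect_cat_sketch(demo, max_unique = 3L)
```

With only a handful of rows, every numeric column falls under the default threshold of 10, so a lower max_unique is used here to show both outcomes.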
Inspects each column and returns a best-guess R type string. Character columns are tested for common date patterns. Numeric columns containing only 0/1 values (with both present) are flagged as logical.
detect_types(df)
df |
A data frame. |
A named character vector with one element per column.
Possible values: "numeric", "integer", "character",
"logical", "factor", "Date", "POSIXct",
"unknown".
df <- data.frame(
  price = c(100.5, 200.3, 300.1),
  rooms = c(2L, 3L, 4L),
  pool = c("Y", "N", "Y"),
  sold = c(TRUE, FALSE, TRUE)
)
detect_types(df)
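The 0/1 rule (numeric columns containing only 0 and 1, with both present) can be expressed as a one-line predicate. `is_binary01_sketch()` is a hypothetical helper, and ignoring NA values is an assumption.

```r
# Hypothetical predicate: TRUE when a numeric column holds exactly the
# values 0 and 1 (both present), ignoring NA.
is_binary01_sketch <- function(x) {
  is.numeric(x) && setequal(unique(x[!is.na(x)]), c(0, 1))
}

is_binary01_sketch(c(0, 1, 1, NA, 0))
is_binary01_sketch(c(1, 1, 1))
```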
Returns the path to <R_user_dir("earthUI","config")>/prefs.json. The
file holds user-level configuration that lives outside the regProj
tree itself — most importantly, the location of the regProj root.
earthui_prefs_path()
Character scalar.
Reads the user preferences. Returns an empty list if the file is missing.
earthui_prefs_read()
Named list.
Writes the user preferences atomically, creating the config directory if needed.
earthui_prefs_write(prefs)
prefs |
Named list to save. |
Invisibly, the prefs path.
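One common way to make a write atomic is to write a temporary file in the destination directory and then rename it over the target. The sketch below shows that pattern in base R; `atomic_write_sketch()` is a hypothetical helper, it writes plain text lines rather than JSON, and it is not confirmed to be how the package implements the write.

```r
# Hypothetical sketch of a write-then-rename update.
atomic_write_sketch <- function(lines, path) {
  dir.create(dirname(path), recursive = TRUE, showWarnings = FALSE)
  # Same directory as the target, so the rename stays on one filesystem.
  tmp <- tempfile(pattern = "prefs_", tmpdir = dirname(path))
  writeLines(lines, tmp)
  if (!file.rename(tmp, path)) {
    unlink(tmp)
    stop("could not move temporary file into place: ", path)
  }
  invisible(path)
}

target <- file.path(tempdir(), "earthui_demo_cfg", "prefs.json")
atomic_write_sketch('{"regproj_root": "/data/regProj"}', target)
readLines(target)
```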
The Shiny app persists per-file, per-purpose settings in a SQLite DB
keyed by "<filename>||<purpose>". export_settings() reads that row
and writes a single JSON file containing the full settings bundle
(target, earth parameters, variable selections, type/special overrides,
and interactions), plus an rca block for batch RCA inputs — the
subject CQA score and CQA score type.
export_settings(filename, purpose, output_json)
filename |
Character scalar. The filename as stored in the DB
(e.g. |
purpose |
Character scalar: |
output_json |
Character scalar. Destination file path ( |
If output_json already exists, the rca block of the existing file
is preserved — re-exporting from the UI does not clobber hand-edited
CQA inputs.
The emitted rca block has two fields:
- the subject CQA score: null or a number in [0.00, 10.00].
- the CQA score type: "CQA/sf" (based on residual / living-area,
default) or "CQA" (based on residual).
The emitted reports field is an array of formats to render in batch
mode: any subset of "html", "pdf", "docx". An empty array []
(default) means no reports are generated.
Invisibly, the output_json path.
## Not run:
export_settings("Appraisal_1.csv", "appraisal", "~/configs/Appraisal_1.json")
## End(Not run)
Wrapper around earth::earth() with parameter validation and automatic
cross-validation when interaction terms are enabled.
fit_earth(
  df, target, predictors,
  categoricals = NULL, linpreds = NULL, type_map = NULL,
  degree = 1L, allowed_func = NULL, allowed_matrix = NULL,
  nfold = NULL, nprune = NULL, thresh = NULL, penalty = NULL,
  minspan = NULL, endspan = NULL, fast.k = NULL, pmethod = NULL,
  glm = NULL, trace = NULL, nk = NULL, newvar.penalty = NULL,
  fast.beta = NULL, ncross = NULL, stratify = NULL,
  varmod.method = NULL, varmod.exponent = NULL, varmod.conv = NULL,
  varmod.clamp = NULL, varmod.minspan = NULL,
  keepxy = NULL, Scale.y = NULL, Adjust.endspan = NULL,
  Auto.linpreds = NULL, Force.weights = NULL, Use.beta.cache = NULL,
  Force.xtx.prune = NULL, Get.leverages = NULL, Exhaustive.tol = NULL,
  wp = NULL, weights = NULL, ..., .capture_trace = TRUE
)
df |
A data frame containing the modeling data. |
target |
Character string. Name of the response variable. |
predictors |
Character vector. Names of predictor variables. |
categoricals |
Character vector. Names of predictors to treat as
categorical (converted to factors before fitting). Default is |
linpreds |
Character vector. Names of predictors constrained to enter
the model linearly (no hinge functions). Default is |
type_map |
Named list or character vector. Maps column names to
declared types (e.g., |
degree |
Integer. Maximum degree of interaction. Default is 1 (no interactions). When >= 2, cross-validation is automatically enabled. |
allowed_func |
Function or |
allowed_matrix |
Logical matrix or |
nfold |
Integer. Number of cross-validation folds. Automatically set
to 10 when |
nprune |
Integer or |
thresh |
Numeric. Forward stepping threshold. Default is earth's default (0.001). |
penalty |
Numeric. Generalized cross-validation penalty per knot.
Default is earth's default (if |
minspan |
Integer or |
endspan |
Integer or |
fast.k |
Integer. Maximum number of parent terms considered at each step of the forward pass. Default is earth's default (20). |
pmethod |
Character. Pruning method. One of |
glm |
List or |
trace |
Numeric. Trace earth's execution. 0 (default) = none, 0.3 = variance model, 0.5 = cross validation, 1-5 = increasing detail. |
nk |
Integer or |
newvar.penalty |
Numeric or |
fast.beta |
Numeric or |
ncross |
Integer or |
stratify |
Logical or |
varmod.method |
Character or |
varmod.exponent |
Numeric or |
varmod.conv |
Numeric or |
varmod.clamp |
Numeric or |
varmod.minspan |
Integer or |
keepxy |
Logical or |
Scale.y |
Logical or |
Adjust.endspan |
Numeric or |
Auto.linpreds |
Logical or |
Force.weights |
Logical or |
Use.beta.cache |
Logical or |
Force.xtx.prune |
Logical or |
Get.leverages |
Logical or |
Exhaustive.tol |
Numeric or |
wp |
Numeric vector or |
weights |
Numeric vector or |
... |
Additional arguments passed to |
.capture_trace |
Logical. If |
A list with class "earthUI_result" containing:
The fitted earth model object.
Name of the response variable.
Names of predictor variables used.
Names of categorical predictors.
Degree of interaction used.
Logical; whether cross-validation was used.
The data frame used for fitting.
# Using the included demo appraisal dataset
demo_file <- system.file("extdata", "Appraisal_1.csv", package = "earthUI")
df <- import_data(demo_file)
result <- fit_earth(df, target = "sale_price",
                    predictors = c("living_sqft", "lot_size", "age"))
format_summary(result)
Extracts the ANOVA table from a fitted earth model.
format_anova(earth_result)
earth_result |
An object of class |
A data frame with the ANOVA decomposition showing which predictors contribute to each basis function and their importance.
result <- fit_earth(mtcars, "mpg", c("cyl", "disp", "hp", "wt"))
format_anova(result)
Converts a fitted earth model into a LaTeX-formatted mathematical representation using g-function notation. Basis functions are grouped by degree (constant, first-degree, second-degree, third-degree) and labeled with indices that encode the group, position, and factor variable count.
format_model_equation(earth_result, digits = 7L, response_idx = NULL)
earth_result |
An object of class |
digits |
Integer. Number of significant digits for coefficients and cut points. Default is 7. |
response_idx |
Integer or |
A list containing:
Character string. LaTeX array environment for HTML/MathJax rendering.
Character string. Wrapped in display math delimiters for MathJax/HTML rendering.
Character string. LaTeX for native PDF output with escaped special characters in text blocks.
Character string. LaTeX for Word/docx output.
List of group structures for programmatic access.
result <- fit_earth(mtcars, "mpg", c("cyl", "disp", "hp", "wt"))
eq <- format_model_equation(result)
cat(eq$latex)
Extracts key statistics from a fitted earth model including coefficients, basis functions, R-squared, GCV, GRSq, and RSS.
format_summary(earth_result)
earth_result |
An object of class |
A list containing:
Data frame of model coefficients and basis functions.
Training R-squared.
Generalized cross-validation value.
Generalized R-squared (1 - GCV/variance).
Residual sum of squares.
Number of terms in the pruned model.
Number of predictors used in the final model.
Number of observations.
Cross-validated R-squared (if CV was used, else NA).
result <- fit_earth(mtcars, "mpg", c("cyl", "disp", "hp", "wt"))
summary_info <- format_summary(result)
summary_info$r_squared
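To make the relationship between RSS, GCV, and GRSq concrete, here is a rough base-R sketch. It assumes the GCV definition given in the 'earth' documentation, RSS / (n * (1 - enp/n)^2) with enp = nterms + penalty * (nterms - 1) / 2, and it uses the GCV of an intercept-only model as the reference quantity, which is close to but not identical to the sample variance. The helper names are hypothetical and an `lm` fit stands in for an earth model.

```r
# Hypothetical helpers; formulas assumed from the 'earth' documentation.
gcv_sketch <- function(rss, n, nterms, penalty = 2) {
  enp <- nterms + penalty * (nterms - 1) / 2   # effective number of parameters
  rss / (n * (1 - enp / n)^2)
}

grsq_sketch <- function(y, fitted, nterms, penalty = 2) {
  n <- length(y)
  gcv_model <- gcv_sketch(sum((y - fitted)^2), n, nterms, penalty)
  gcv_null  <- gcv_sketch(sum((y - mean(y))^2), n, nterms = 1, penalty)
  1 - gcv_model / gcv_null
}

fit <- lm(mpg ~ wt, data = mtcars)   # stand-in for a two-term model
grsq_sketch(mtcars$mpg, fitted(fit), nterms = 2)
```

Because GCV penalises each added term, GRSq computed this way is lower than the training R-squared for the same fit.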
Extracts variable importance scores from a fitted earth model using
earth::evimp().
format_variable_importance(earth_result)
earth_result |
An object of class |
A data frame with columns variable, nsubsets, gcv, and rss,
sorted by overall importance (nsubsets).
result <- fit_earth(mtcars, "mpg", c("cyl", "disp", "hp", "wt"))
format_variable_importance(result)
Writes a self-contained Quarto project under <dest_dir>/<base>_qmd/
containing the populated <base>.qmd source, all pre-generated plots
(PNG + PDF), the report data RDS, and reference.docx for Word
rendering. The resulting bundle can be edited or combined with other
Quarto sources, then rendered to HTML / Word / PDF via
convert_quarto_file().
generate_quarto_report(earth_result, dest_dir, base = "earth_report")
earth_result |
An object of class |
dest_dir |
Directory to write the bundle into. The bundle itself
lives at |
base |
Bundle base name (no extension). The .qmd file inside is
named |
Use this when you want the Quarto source as a first-class artifact — e.g., to combine reports from multiple projects into a master document before publishing.
Invisibly, the absolute path to the generated .qmd file.
Reads model settings for a project from <REGPROJ_ROOT>/projects.sqlite.
Used by the Shiny UI to restore settings on file open, and by external
tools (ValEngr, batch scripts) to inspect project state. Settings are
scoped by project + purpose and are shared across all data files in the
project (a small test extract and the full dataset share one config).
get_project_settings(
  project_path,
  method = "earth",
  purpose = "general",
  root = default_regproj_root()
)
project_path |
Absolute path to the project root. |
method |
One of |
purpose |
Settings scope: one of |
root |
regProj root. Defaults to |
Named list with settings, variables, and (for earth)
interactions (each a JSON string), or NULL if no row exists.
Reads a CSV (.csv) or 'Excel' (.xlsx, .xls) file and returns a data frame. Column names are converted to snake_case and duplicates are made unique.
import_data(filepath, sheet = 1, sep = ",", dec = ".", ...)
filepath |
Character string. Path to the data file. Supported formats:
|
sheet |
Character or integer. For Excel files, the sheet to read. Defaults to the first sheet. Ignored for CSV files. |
sep |
Character. Field separator for CSV files. Default |
dec |
Character. Decimal separator for CSV files. Default |
... |
Additional arguments passed to |
A data frame with column names converted to snake_case. Duplicate column names are made unique by appending numeric suffixes.
# Load the included demo appraisal dataset
demo_file <- system.file("extdata", "Appraisal_1.csv", package = "earthUI")
df <- import_data(demo_file)
head(df)
Returns TRUE if path exists, sits two directories deep under root
(<purpose>/<flat_segment>/), and the flat segment parses cleanly.
is_project_dir(path, root = default_regproj_root())
path |
Absolute path to check. |
root |
regProj root. Defaults to |
Logical scalar.
Opens an interactive 'shiny' GUI for building and exploring 'earth' (MARS-style) models. The application provides data import, variable configuration, model fitting, result visualization, and report export.
launch(port = 7878L, ...)
port |
Integer. Port number for the Shiny app. Defaults to 7878. A fixed port keeps browser-side UI preferences (theme, last-used purpose) consistent across sessions. (Model configuration is saved server-side in the project database, not in the browser.) |
... |
Additional arguments passed to |
This function does not return a value; it launches the Shiny app.
if (interactive()) {
  launch()
}
Returns a data frame describing each non-intercept g-function group from the
model equation, including degree, factor count, graph dimensionality, and
the number of terms. The g-function notation is ${}^{f}g^{j}_{k}$,
where f = number of factor variables
(top-left), j = degree of interaction (top-right), k = position within the
degree group (bottom-right).
list_g_functions(earth_result)
earth_result |
An object of class |
A data frame with columns:
Integer. Sequential index (1-based).
Character. Variable names in the group.
Integer. Degree of the g-function (top-right superscript).
Integer. Position within the degree (bottom-right subscript).
Integer. Number of factor variables (top-left superscript).
Integer. Graph dimensionality (degree minus factor count).
Integer. Number of terms in the group.
result <- fit_earth(mtcars, "mpg", c("cyl", "disp", "hp", "wt"))
list_g_functions(result)
Returns "mac" on Darwin, "ubuntu" on Linux, "win11" on Windows.
This is the value used as the <os> segment in regProj paths so that
multi-OS output can be merged cleanly on a developer's machine.
os_detect()
Character scalar: "mac", "ubuntu", or "win11".
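The mapping from system name to path segment can be sketched with `Sys.info()`. `os_sketch()` is a hypothetical helper; the package may detect the platform differently, and the error for unrecognised systems is an assumption.

```r
# Hypothetical sketch of the mapping described above.
os_sketch <- function(sysname = Sys.info()[["sysname"]]) {
  switch(sysname,
    Darwin  = "mac",
    Linux   = "ubuntu",
    Windows = "win11",
    stop("unsupported system: ", sysname)
  )
}

os_sketch("Darwin")
```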
Creates a scatter plot of actual vs predicted values with a 1:1 reference line.
plot_actual_vs_predicted(earth_result, response_idx = NULL)
earth_result |
An object of class |
response_idx |
Integer or |
A ggplot2::ggplot object.
result <- fit_earth(mtcars, "mpg", c("cyl", "disp", "hp", "wt"))
plot_actual_vs_predicted(result)
Creates a scatter plot showing each variable's actual contribution to the prediction. For each observation, the contribution is the sum of coefficient * basis function value across all terms involving that variable.
plot_contribution(earth_result, variable, response_idx = NULL)
earth_result |
An object of class |
variable |
Character string. Name of the predictor variable to plot. |
response_idx |
Integer or |
A ggplot2::ggplot object.
result <- fit_earth(mtcars, "mpg", c("cyl", "disp", "hp", "wt"))
plot_contribution(result, "wt")
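The per-observation contribution (the sum of coefficient times basis function value over the terms involving a variable) can be worked through by hand with two hinge functions. The knots and coefficients below are invented for illustration and do not come from a fitted model.

```r
# Hinge (basis) function: max(0, sign * (x - knot)).
hinge <- function(x, knot, sign = 1) pmax(0, sign * (x - knot))

x <- c(1, 2, 3, 4)
basis <- cbind(
  h_right = hinge(x, knot = 2),             # max(0, x - 2)
  h_left  = hinge(x, knot = 3, sign = -1)   # max(0, 3 - x)
)
coefs <- c(h_right = 2.0, h_left = -1.5)    # hypothetical coefficients

# Contribution of x for each observation: sum over its terms.
contribution <- as.vector(basis %*% coefs)
contribution
```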
Creates a heatmap of pairwise correlations among the target variable and numeric predictors, with cells colored by degree of correlation and values printed in each cell.
plot_correlation_matrix(earth_result)
earth_result |
An object of class |
A ggplot2::ggplot object.
result <- fit_earth(mtcars, "mpg", c("cyl", "disp", "hp", "wt"))
plot_correlation_matrix(result)
Creates a ggplot2 visualization for any g-function group. For d <= 1,
produces a 2D scatter plot (same as plot_g_function()). For d >= 2,
produces a filled contour plot suitable for static formats like PDF and Word.
plot_g_contour(earth_result, group_index, response_idx = NULL)
earth_result |
An object of class |
group_index |
Integer. Index of the g-function group (1-based, from
|
response_idx |
Integer or |
A ggplot2::ggplot object.
result <- fit_earth(mtcars, "mpg", c("cyl", "disp", "hp", "wt"))
plot_g_contour(result, 1)
Creates a contribution plot for a specific g-function group. For degree-1 groups (single variable), produces a 2D scatter + piecewise-linear plot with slope labels and knot markers. For degree-2 groups (two variables), produces a 3D surface plot using plotly if available, or a filled contour plot.
plot_g_function(earth_result, group_index, response_idx = NULL)
earth_result |
An object of class |
group_index |
Integer. Index of the g-function group (1-based, from
|
response_idx |
Integer or |
A ggplot2::ggplot object for d <= 1, or a plotly widget for d >= 2 (when plotly is installed). Falls back to ggplot2 contour if plotly is not available.
result <- fit_earth(mtcars, "mpg", c("cyl", "disp", "hp", "wt"))
plot_g_function(result, 1)
Creates a base R persp() 3D surface plot for g-function groups with d >= 2.
For d <= 1, produces a 2D scatter plot (same as plot_g_function()).
The surface is colored by contribution value using a blue-white-red scale.
Suitable for PDF and Word output where interactive plotly is not available.
plot_g_persp(
  earth_result,
  group_index,
  theta = 30,
  phi = 25,
  response_idx = NULL
)
earth_result |
An object of class |
group_index |
Integer. Index of the g-function group (1-based, from
|
theta |
Numeric. Azimuthal rotation angle in degrees. Default 30. |
phi |
Numeric. Elevation angle in degrees. Default 25. |
response_idx |
Integer or |
Invisible NULL (base graphics). For d <= 1, returns a ggplot object.
result <- fit_earth(mtcars, "mpg", c("cyl", "disp", "hp", "wt"), degree = 2L)
plot_g_persp(result, 1)
Creates a partial dependence plot for a selected variable from a fitted earth model.
plot_partial_dependence(
  earth_result,
  variable,
  n_grid = 50L,
  response_idx = NULL
)
earth_result |
An object of class |
variable |
Character string. Name of the predictor variable to plot. |
n_grid |
Integer. Number of grid points for the partial dependence calculation. Default is 50. |
response_idx |
Integer or |
A ggplot2::ggplot object.
result <- fit_earth(mtcars, "mpg", c("cyl", "disp", "hp", "wt"))
plot_partial_dependence(result, "wt")
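For readers unfamiliar with the quantity being plotted, the following sketch computes a partial dependence curve by its usual definition: fix the variable at each grid value for every row, predict, and average. It uses lm() as a stand-in model so that it runs with base R only; whether plot_partial_dependence() computes the curve in exactly this way is an assumption.

```r
# Stand-in model (not an earth fit) so the sketch needs no extra packages.
fit <- lm(mpg ~ wt + hp, data = mtcars)

# Grid over the observed range of the variable, as n_grid = 50L would give.
grid <- seq(min(mtcars$wt), max(mtcars$wt), length.out = 50L)

# For each grid value: overwrite wt in every row, predict, then average.
pd <- vapply(grid, function(v) {
  tmp <- mtcars
  tmp$wt <- v
  mean(predict(fit, newdata = tmp))
}, numeric(1))

head(data.frame(wt = grid, partial_dependence = pd))
```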
Creates a normal Q-Q plot of the model residuals.
plot_qq(earth_result, response_idx = NULL)
earth_result |
An object of class |
response_idx |
Integer or |
A ggplot2::ggplot object.
result <- fit_earth(mtcars, "mpg", c("cyl", "disp", "hp", "wt"))
plot_qq(result)
Creates a diagnostic plot of residuals versus fitted values.
plot_residuals(earth_result, response_idx = NULL)
earth_result |
An object of class |
response_idx |
Integer or |
A ggplot2::ggplot object showing residuals vs fitted values.
Use plot_qq() for the Q-Q plot separately.
result <- fit_earth(mtcars, "mpg", c("cyl", "disp", "hp", "wt"))
plot_residuals(result)
Creates a horizontal bar chart of variable importance from a fitted earth model.
plot_variable_importance(earth_result, type = "nsubsets")
earth_result |
An object of class |
type |
Character. Importance metric to plot: |
A ggplot2::ggplot object.
result <- fit_earth(mtcars, "mpg", c("cyl", "disp", "hp", "wt"))
plot_variable_importance(result)
Pre-generates all plots and data for the earth model report. Returns the
path to a directory containing all assets. This directory can be passed to
render_report() to avoid re-computing anything during rendering.
prepare_report_assets(earth_result, assets_dir = NULL)
earth_result |
An object of class |
assets_dir |
Character. Path to write assets. If |
The path to the assets directory (invisibly).
result <- fit_earth(mtcars, "mpg", c("cyl", "disp", "hp", "wt"))
assets <- prepare_report_assets(result)
Joins country, admin levels, and project name with _ to produce the
single hierarchy-encoding folder name used at the project level under
<root>/<purpose>/. Validates each component: admin codes must match
^[a-z0-9-]+$ (no internal underscores), project name allows
^[A-Za-z0-9_-]+$.
regproj_flat_segment(country, levels, project_name)
country |
Lowercase ISO 3166-1 alpha-2 country code (e.g. |
levels |
Character vector of admin codes, ordered top to bottom.
Length must match |
project_name |
Project leaf name. Must satisfy |
Character scalar (e.g. "us_ca_081_burlin_20251231_j").
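The validation and joining rules described above can be sketched in a few lines of base R. The helper name flat_segment_sketch is hypothetical, and the split of the example segment into admin levels ("ca", "081", "burlin") and project name ("20251231_j") is an assumption; the real function also checks the number of levels against the country schema, which this sketch omits.

```r
# Hypothetical re-implementation for illustration; not the package's code.
flat_segment_sketch <- function(country, levels, project_name) {
  # Admin codes: lowercase letters, digits and hyphens only (no underscores).
  bad <- !grepl("^[a-z0-9-]+$", levels)
  if (any(bad)) {
    stop("invalid admin code(s): ", paste(levels[bad], collapse = ", "))
  }
  # Project names may additionally contain uppercase letters and underscores.
  if (!grepl("^[A-Za-z0-9_-]+$", project_name)) {
    stop("invalid project name: ", project_name)
  }
  paste(c(country, levels, project_name), collapse = "_")
}

seg <- flat_segment_sketch("us", c("ca", "081", "burlin"), "20251231_j")
print(seg)
```

Forbidding underscores inside admin codes is what keeps the segment splittable on _ later, since only the trailing project name may contain them.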
Returns a DBI connection. Creates the schema on first call. The
caller is responsible for closing the connection with DBI::dbDisconnect().
regproj_geo_db_connect(root = default_regproj_root())
root |
regProj root. Defaults to |
On first creation, the tables are seeded from:
- the shipped reference data (regproj_reference()): 24 countries, 51 US states, 3,076 US counties.
- the shipped places data (pkg/inst/extdata/regproj_geo.rds): US incorporated places plus GeoNames-derived city/admin data for GB, DE, IT, FR, SE, SG.
- any pre-existing <REGPROJ_ROOT>/.regproj-index.json (legacy migration).
A DBIConnection to the SQLite database.
Returns the path to <REGPROJ_ROOT>/geo.sqlite, which holds country codes and a flexible,
variable-depth admin_entries table for state / county / city /
deeper-level admin codes per country. The file travels with the regProj tree.
regproj_geo_db_path(root = default_regproj_root())
root |
regProj root. Defaults to |
Character scalar.
Returns the basenames of regular files in <project_path>/<os>_in/.
Hidden dotfiles and subdirectories are excluded. Sorted alphabetically.
regproj_in_files(project_path, os = os_detect())
project_path |
Absolute path to the project root (the flat segment folder). |
os |
OS segment to use. Defaults to |
Character vector of file basenames.
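The listing behaviour described above (regular files only, no dotfiles, no subdirectories, alphabetical order) can be approximated with base R as follows. The helper in_files_sketch and the demo folder name are hypothetical; the real function derives the folder from project_path and os.

```r
# Hypothetical helper: list regular, non-hidden files in one directory.
in_files_sketch <- function(dir) {
  # all.files = FALSE skips dotfiles; directories are still listed, so drop them.
  entries <- list.files(dir, all.files = FALSE, full.names = TRUE)
  files <- entries[!dir.exists(entries)]
  sort(basename(files))
}

# Build a throwaway input folder to demonstrate the filtering.
demo_dir <- file.path(tempdir(), "linux_in_demo")
dir.create(demo_dir, showWarnings = FALSE)
invisible(file.create(file.path(demo_dir, c("b.csv", "a.csv", ".hidden"))))
dir.create(file.path(demo_dir, "subdir"), showWarnings = FALSE)

found <- in_files_sketch(demo_dir)
print(found)
```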
Looks up full_name under scope (e.g. "us/ca") in the geo
SQLite DB (which is seeded with shipped reference data on first
creation). Returns NULL if not found.
regproj_index_get(scope, full_name, root = default_regproj_root())
scope |
Character scalar. Slash-separated path of parent codes
(e.g. |
full_name |
Character scalar. The display name to look up. |
root |
regProj root. Defaults to |
Character scalar code, or NULL.
The geo data is now stored in <REGPROJ_ROOT>/geo.sqlite (see
regproj_geo_db_path()). This function returns the location of the
legacy .regproj-index.json file, which is migrated into the
SQLite DB on first connect and is no longer written to.
regproj_index_path(root = default_regproj_root())
root |
regProj root. Defaults to |
Character scalar.
Inserts (or replaces) the mapping under the given scope.
regproj_index_put(scope, full_name, code, root = default_regproj_root())
scope |
Character scalar. Slash-separated path of parent codes
(e.g. |
full_name |
Character scalar. The display name to look up. |
code |
Character scalar. The path code to assign. |
root |
regProj root. Defaults to |
Invisibly, the code.
Returns the entire geo DB as a nested list keyed by scope (slash-
separated parent-code path), for backward compatibility with callers
that iterate. New code should prefer regproj_index_get() or direct
DB queries via regproj_geo_db_connect().
regproj_index_read(root = default_regproj_root())
root |
regProj root. Defaults to |
Named list. Outer key is scope ("" for countries, then
slash-separated codes per admin level). Inner key is full name;
value is the code.
Deprecated. The geo data is now in SQLite; this is a no-op kept for backward compatibility.
regproj_index_write(idx, root = default_regproj_root())
idx |
Ignored. |
root |
regProj root. Defaults to |
Invisibly, the geo DB path.
Stores the basename of the most-recently-selected input file in
<project>/<os>/.regproj-last, so it can be auto-selected the next
time the project is opened.
regproj_last_file_path(project_path, os = os_detect())
regproj_last_file_get(project_path, os = os_detect())
regproj_last_file_set(project_path, basename, os = os_detect())
project_path |
Absolute project path. |
os |
OS segment. Defaults to |
basename |
File basename to remember. |
In the order of the usage above: the path to the marker file, the stored basename, and (invisibly) the path.
Walks <root>/<purpose>/<country>/<state>/<county>/<city>/<project_name>/
and returns a data frame describing each project found, with its mtime
(most recent of the in/ and out/ trees, falling back to the project
folder itself).
regproj_list_projects(root = default_regproj_root(), sort_by = c("recent", "alpha"))
root |
regProj root. Defaults to |
sort_by |
|
Recognized purposes: gen, appr, mktarea. Other top-level
subdirectories under <root> are ignored, including any leading-dot
files like .regproj-index.json.
Data frame with columns: project_path (absolute), purpose,
country, state, county, city, project_name, mtime
(POSIXct). Empty data frame (correct columns, zero rows) if no
projects exist or root is missing.
Inverse of regproj_flat_segment(). Splits on _, takes the first
token as country, then the next length(country_schema(country))
tokens as admin levels; the remaining tokens (joined by _) are the
project name. Returns NULL on parse failure.
regproj_parse_flat(segment)
segment |
Character scalar. |
Named list with country, levels (character vector),
project_name — or NULL if parsing failed.
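The parsing rule can be sketched as below. Because the real function looks up the number of admin levels via country_schema(country), which is not reproduced here, the hypothetical helper parse_flat_sketch takes that count as an explicit n_levels argument; the value 3 used for the example is an assumption.

```r
# Hypothetical helper; n_levels stands in for length(country_schema(country)).
parse_flat_sketch <- function(segment, n_levels) {
  tokens <- strsplit(segment, "_", fixed = TRUE)[[1]]
  # Need the country, n_levels admin codes and at least one project-name token.
  if (length(tokens) < n_levels + 2L) return(NULL)
  list(
    country = tokens[1],
    levels = tokens[seq_len(n_levels) + 1L],
    # Remaining tokens are re-joined, so project names may contain underscores.
    project_name = paste(tokens[-seq_len(n_levels + 1L)], collapse = "_")
  )
}

parsed <- parse_flat_sketch("us_ca_081_burlin_20251231_j", n_levels = 3L)
str(parsed)
```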
Composes the path
<root>/<purpose>/<flat_segment>/<os>_<in|out>[_<method>] from its
components. The hierarchy (country / admin levels / project name) is
concatenated into a single folder under <purpose>/ to keep the tree
shallow. Pure path computation — does not create any directories
unless create = TRUE.
regproj_path(purpose, country, levels, project_name, os = os_detect(), in_or_out = c("out", "in"), method = "earth", root = default_regproj_root(), create = FALSE)
purpose |
|
country |
Lowercase ISO 3166-1 alpha-2 country code (e.g. |
levels |
Character vector of admin codes, ordered top to bottom.
Length must match |
project_name |
Project leaf name. Must satisfy |
os |
One of |
in_or_out |
One of |
method |
Optional method subdir (e.g. |
root |
Optional explicit regProj root. Defaults to
|
create |
Logical. If |
Character scalar. Absolute normalized path.
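To make the path layout concrete, the sketch below assembles <root>/<purpose>/<flat_segment>/<os>_<in|out>[_<method>] from already-validated pieces. The helper path_sketch, the root "/data/regproj" and the OS segment "linux" are hypothetical values chosen for illustration; unlike the real function, the sketch neither normalizes the path nor creates directories.

```r
# Hypothetical helper: pure string composition, no filesystem access.
path_sketch <- function(root, purpose, flat_segment, os,
                        in_or_out = c("out", "in"), method = NULL) {
  in_or_out <- match.arg(in_or_out)
  leaf <- paste(os, in_or_out, sep = "_")
  # The method suffix is optional, matching the [_<method>] part of the layout.
  if (!is.null(method) && nzchar(method)) {
    leaf <- paste(leaf, method, sep = "_")
  }
  file.path(root, purpose, flat_segment, leaf)
}

p <- path_sketch("/data/regproj", "appr", "us_ca_081_burlin_20251231_j",
                 os = "linux", in_or_out = "out", method = "earth")
print(p)
```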
Returns a DBI connection. Creates the schema on first call. The
caller is responsible for closing the connection with DBI::dbDisconnect().
regproj_projects_db_connect(root = default_regproj_root())
root |
regProj root. Defaults to |
A DBIConnection to the SQLite database.
Returns the path to <REGPROJ_ROOT>/projects.sqlite, which holds one row per project, keyed by the
flat segment (e.g. "us_ca_081_burlin_20251231_j"). Each row stores project
metadata and per-method settings as JSON blobs.
regproj_projects_db_path(root = default_regproj_root())
root |
regProj root. Defaults to |
Character scalar.
Reads pkg/inst/extdata/regproj_reference.json once per session and
caches the result. Contains country names, US states, and US counties
(with FIPS codes). Used by the UI cascades to populate dropdowns.
regproj_reference()
A nested list with components version, countries, states,
counties.
Renders a parameterized 'Quarto' report from the fitted 'earth' model results. Requires the 'quarto' R package and a 'Quarto' installation.
render_report(earth_result, output_format = "html", output_file = NULL, paper_size = "letter", assets_dir = NULL)
earth_result |
An object of class |
output_format |
Character. Output format: |
output_file |
Character. Path for the output file. If |
paper_size |
Character. Paper size for PDF output: |
assets_dir |
Character or |
The path to the rendered output file (invisibly).
result <- fit_earth(mtcars, "mpg", c("cyl", "disp", "hp", "wt"))
render_report(result, output_format = "html", output_file = tempfile(fileext = ".html"))
Given an RCA-adjusted data frame (subject in row 1, comps in rows 2+), builds comp-summary tables, filters by weight, sorts by gross adjustment percentage, splits into "recommended" (gross adjustment < threshold) and "others", and caps the recommended list.
select_sales_grid_comps(rca_df, sale_age_col = "sale_age", min_weight = 0, max_gross_adj_pct = 0.25, max_recommended = 30L)
rca_df |
A data frame produced by the RCA workflow. Row 1 is the
subject; rows 2+ are comps. Expected columns (any may be missing; NA is
substituted): |
sale_age_col |
Character scalar giving the column name that holds
sale age in days. Defaults to |
min_weight |
Numeric. Comps with |
max_gross_adj_pct |
Numeric. Comps whose |
max_recommended |
Integer. Upper bound on the number of recommended
comps returned. Default |
This is the non-Shiny computation kernel used by the earthUI Shiny app's Sales Grid modal, and is also suitable for use from batch scripts.
A named list with:
Data frame of recommended comps, sorted by
sale_age ascending, capped at max_recommended rows.
Data frame of eligible comps not in the recommended
set, sorted by gross_adj_pct ascending.
Each data frame has columns: row, id, address, sale_price,
sale_age, weight, gross_adj, gross_adj_pct.
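The filter, split and sort steps can be illustrated on a small made-up comp table. This is a sketch of the logic as described, not the package's implementation: the data are invented, the object names recommended and others are chosen for readability, and the weight comparison is written as weight > min_weight (the example avoids boundary values, so the exact comparison does not affect the result).

```r
# Invented comps for illustration; the real input is an RCA-adjusted data frame.
comps <- data.frame(
  id = c("c1", "c2", "c3", "c4"),
  sale_age = c(120, 30, 400, 60),
  weight = c(0.9, 0.5, 0.8, 0.05),
  gross_adj_pct = c(0.10, 0.30, 0.20, 0.05),
  stringsAsFactors = FALSE
)
min_weight <- 0.1
max_gross_adj_pct <- 0.25
max_recommended <- 30L

# 1. Drop low-weight comps (assumed comparison, see note above).
eligible <- comps[comps$weight > min_weight, ]

# 2. Recommended: gross adjustment below the threshold, by sale age, capped.
recommended <- eligible[eligible$gross_adj_pct < max_gross_adj_pct, ]
recommended <- head(recommended[order(recommended$sale_age), ], max_recommended)

# 3. Others: every eligible comp not recommended, by gross adjustment.
others <- eligible[!eligible$id %in% recommended$id, ]
others <- others[order(others$gross_adj_pct), ]

print(recommended$id)
print(others$id)
```

Here c4 is dropped for its low weight, c2 exceeds the 25% gross adjustment threshold and falls into the second table, and c1 and c3 are recommended in order of sale age.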
Writes model settings for a project to <REGPROJ_ROOT>/projects.sqlite.
Used by the Shiny UI to persist settings, and by external tools to seed
projects programmatically. Settings are scoped by project + purpose and
shared across all data files in the project.
set_project_settings(project_path, settings = NULL, variables = NULL, interactions = NULL, method = "earth", purpose = "general", root = default_regproj_root())
project_path |
Absolute path to the project root. |
settings, variables, interactions
|
JSON strings (or |
method |
One of |
purpose |
Settings scope: one of |
root |
regProj root. Defaults to |
Invisibly, NULL.
Checks each selected predictor's actual data against the user-declared type. Returns a list of errors (blocking), warnings (non-blocking), and any Date/POSIXct columns that will be auto-converted to numeric.
validate_types(df, type_map, predictors)
df |
A data frame. |
type_map |
Named list or character vector. Names are column names,
values are declared types (e.g., |
predictors |
Character vector of selected predictor column names. |
A list with components:
Logical. TRUE if no blocking errors found.
Character vector of non-blocking warnings.
Character vector of blocking errors.
Character vector of Date/POSIXct predictor columns that will be auto-converted to numeric.
df <- data.frame(price = c(100, 200, 300), city = c("A", "B", "C"))
types <- list(price = "numeric", city = "character")
validate_types(df, types, predictors = c("price", "city"))
Writes the model print-out, summary.earth(), and optional variance
model / trace log to
<file_name>_earth_output_<timestamp>.txt in output_folder.
write_earth_output(result, output_folder, file_name)
result |
A fit result list as returned by |
output_folder |
Character scalar. May be |
file_name |
Character scalar. Used to derive the output filename. |
Invisibly, NULL.
Writes the timestamped contents of an earth fitting trace to
<file_name>_earth_log_<timestamp>.txt in output_folder.
If output_folder is NULL or empty, ~/Downloads is used. The folder
is created if it doesn't exist. Errors are caught and reported via
message() so batch pipelines don't fail on logging issues.
write_fit_log(output_folder, lines, file_name)
output_folder |
Character scalar. Directory in which to write the
log. May be |
lines |
Character vector of log lines. |
file_name |
Character scalar. Used to derive the log filename (the extension is stripped). |
Invisibly, NULL.
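The "never fail the pipeline" behaviour described for this function can be sketched with tryCatch(). The helper write_log_sketch is hypothetical, and the timestamp format is an assumption, since the exact format used in the filename is not stated here.

```r
# Hypothetical helper: write a log file, reporting problems via message().
write_log_sketch <- function(output_folder, lines, file_name) {
  tryCatch({
    # Fall back to ~/Downloads when no folder is given.
    if (is.null(output_folder) || !nzchar(output_folder)) {
      output_folder <- file.path("~", "Downloads")
    }
    dir.create(output_folder, recursive = TRUE, showWarnings = FALSE)
    stamp <- format(Sys.time(), "%Y%m%d_%H%M%S")  # assumed timestamp format
    base <- tools::file_path_sans_ext(basename(file_name))
    path <- file.path(output_folder, paste0(base, "_earth_log_", stamp, ".txt"))
    writeLines(lines, path)
  }, error = function(e) {
    # Logging problems are reported but never stop the caller.
    message("Could not write fit log: ", conditionMessage(e))
  })
  invisible(NULL)
}

out_dir <- file.path(tempdir(), "fit_log_demo")
write_log_sketch(out_dir, c("pass 1", "pass 2"), "houses.csv")
written <- list.files(out_dir, pattern = "^houses_earth_log_.*\\.txt$")
print(written)
```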