Package 'ham'

Title:	Healthcare Analysis Methods
Description:	Conducts analyses for healthcare program evaluations or intervention studies. Calculates regression analyses for standard ordinary least squares (OLS or linear) or logistic models. Performs regression models used for causal modeling such as differences-in-differences (DID) and interrupted time series (ITS) models. Provides limited interpretations of model results and a ranking of variable importance in models. Performs propensity score models, top-coding of model outcome variables, and can return new data with the newly formed variables. Conducts Bayesian analysis summaries and graphs, decision curve analysis, and produces some Shewhart control charts. Also performs Cronbach's alpha for various scale items (e.g., survey questions). See Github URL for examples in the README file. For more details on the statistical methods, see Allen & Yen (1979, ISBN:0-8185-0283-5), Angrist & Pischke (2009, ISBN:9780691120355), Cohen (1988, ISBN:0-8058-0283-5), Gebski (2012) <doi:10.1017/S0950268812000179>, Gelman & Goodrich (2019) <doi:10.1080/00031305.2018.1549100>, Harrell (2016, ISBN:978-3-319-19424-0), Kline (1999, ISBN:9780415211581), Kruschke (2014, ISBN:9780124058880), Linden (2015) <doi:10.1177/1536867X1501500208>, Merlo (2006) <doi:10.1136/jech.2004.029454>, Muthen & Satorra (1995) <doi:10.2307/271070>, Rabe-Hesketh & Skrondal (2008, ISBN:978-1-59718-040-5), Ryan (2011, ISBN:978-0-470-59074-4), and Vickers & Elkin (2006) <doi:10.1177/0272989X06295361>.
Authors:	Stephen Zuniga [aut, cre, cph] (ORCID: <https://orcid.org/0000-0003-1458-3924>)
Maintainer:	Stephen Zuniga <[email protected]>
License:	MIT + file LICENSE
Version:	1.2.0.9000
Built:	2026-06-09 07:48:12 UTC
Source:	https://github.com/szuniga07/ham

Help Index

Calculates Cronbach's alpha on scale items
Assess models with regression
Summarize Bayesian Markov Chain Monte Carlo (MCMC) object
Patient survey data
Markov Chain Monte Carlo linear regression estimates of plant's CO2 uptake regressed on ambient carbon dioxide concentrations
Bayes class object with summarized Markov Chain Monte Carlo estimates of plant's CO2 uptake as a binary variable that is above or below the median level
Statistics for Shewhart control charts
Statistics for Decision Curve Analysis and logistic regression model classification
Group level confidence intervals and between-group variation
Patient hospital program/intervention data, intervention group only
Patient hospital program/intervention data
Importance of variables based on partial chi-square statistic
Hospital acquired infections (HAI) during 41 months.
Interpret model output
Interrupted time series analysis effects
Markov Chain Monte Carlo estimates of hospital length of stay from the hosprog data frame
Prediction plot of treatment and control groups for DID and ITS models
Bayesian plots for various analyses
Shewhart control charts
Decision Curve Analysis plots and regression model classification graphs on sensitivity and specificity
Confidence interval graphs for group class objects
Plot of variable importance ranked by partial chi-square statistic
Print alpha results
Print interpret object
USA's unemployment rate between 1929 and 2024

Calculates Cronbach's alpha on scale items

Description

Performs Cronbach's alpha of specified items from a data frame. Cronbach's Alpha is a formula for estimating the internal consistency reliability of a measurement instrument such as survey items (see Allen & Yang, 1979; Kline, 1999). Survey items can have 2 or more categories such as 5-point scales and contain 2 or more items.

Usage

alpha(items, data)
alpha(items, data)

Arguments

items

Vector of item names that form a scale (e.g., 5-point Likert scales)

data

Data frame object.

Value

A list object with Cronbach's alpha summary statistics.

References

Allen, M. J., & Yen, W. M. (1979). Introduction to Measurement Theory. Brooks/Cole. ISBN: 0-8185-0283-5. Kline, Paul (1999). Handbook of Psychological Testing (2nd ed). Routledge, New York. ISBN: 9780415211581.

Examples

alpha(items=c("i1","i2","i3","i4","i5"), data=cas)

# remove i1 as suggested in the previous example, returns higher alpha
alpha(items=c("i2","i3","i4","i5"), data=cas)
alpha(items=c("i1","i2","i3","i4","i5"), data=cas)

# remove i1 as suggested in the previous example, returns higher alpha
alpha(items=c("i2","i3","i4","i5"), data=cas)

Assess models with regression

Description

Fit ordinary least squares (OLS) and logistic models. And fit models for causal inference such as differences-in-differences and interrupted time series. Run these models to evaluate program performance or test intervention effects (e.g., healthcare programs). Options are available for top coding the outcome variable as well as propensity scores. New data can optionally be returned that has these additional variables and constructed variables that are used for DID and ITS models.

Usage

assess(
  formula,
  data,
  regression = "none",
  did = "none",
  its = "none",
  intervention = NULL,
  int.time = NULL,
  treatment = NULL,
  interrupt = NULL,
  subset = NULL,
  stagger = NULL,
  topcode = NULL,
  propensity = NULL,
  trim = NULL,
  weights = NULL,
  offset = NULL,
  newdata = FALSE,
  family = NULL
)
assess(
  formula,
  data,
  regression = "none",
  did = "none",
  its = "none",
  intervention = NULL,
  int.time = NULL,
  treatment = NULL,
  interrupt = NULL,
  subset = NULL,
  stagger = NULL,
  topcode = NULL,
  propensity = NULL,
  trim = NULL,
  weights = NULL,
  offset = NULL,
  newdata = FALSE,
  family = NULL
)

Arguments

formula

a formula object. Use 'Y ~ .' in DID and ITS models to only specify the constructed model variables (e.g., right side of the DID model: Y ~ Post.All + Int.Var + DID). If regression=ols or regression=logistic, 'Y ~ .' will use all variables in the data.frame as is standard in formulas.

data

a data.frame in which to interpret the variables named in the formula.

regression

Select a regression method for standard regression models (i.e., neither DID nor ITS). Options are regression="ols" (ordinary least squares AKA linear), regression="logistic", or regression='Poisson'. Default is regression="none" for no standard regression model.

did

option for Differences-in-Differences (DID) regression. Select did="two" for models with only 2 time points (e.g., pre/post-test). Select did="many" for >= 3 time points (e.g., monthly time points in 12 months of data). Default is did="none" for no DID.

its

option for Interrupted Time Series (ITS) regression. Select its="one" for one group (e.g., intervention only). Select its="two" for two groups (intervention and control). Default is did="none" for no ITS.

intervention

optional intervention variable name selected for DID, ITS, and propensity score models that indicate which cases are in the intervention or not when propensity is a character class (i.e., intervention can be NULL when propensity is a formula class for a non-DID or non-ITS model).

int.time

optional intervention time variable name selected for DID or ITS models. This indicates the duration of time relative to when the intervention started.

treatment

optional treatment start period variable name selected for DID models. Select 1 value from 'int.time' to indicate the start of the intervention.

interrupt

optional interruption (or intervention) period(s) variable name selected for ITS models. Select 1 or more values from 'int.time' to indicate the start and/or key intervention periods. There needs to be at least 2 time points per period, at least 3 is better. For example, interrupt= c(3, 5, 7) will suffice, especially if you want to isolate certain periods but interrupt= c(3, 6, 9) may provide more useful information.

subset

an expression defining a subset of the observations to use in the regression model. The default is NULL, thereby using all observations. Specify, for example, data$hospital == "NY" or c(1:100,200:300) respectively to use just those observations. This is helpful when doing a submodel for DID or ITS after identifying similar groups. DID and ITS models could be improved by limiting the choice of control groups to only those with similar values on the intervention indicator and baseline trend variable (e.g., 'ITS.Time' and 'ITS.Int') with p-values >= 0.10.

stagger

optional list to indicate staggered entry into the intervention or treatment group. Relevant model variables are re-coded to appropriate values and can be used for a form of 'stacked' DID or ITS. If a group of cases joins X months after the primary sample, model variables are adjusted X months. This three element list named: 'a' = a character vector for the name of the grouping column; 'b' = specific categories or levels that indicate which cases have a staggered entry; and 'c' = the time point values at staggered entry. Both 'b' and 'c' must have identical lengths. For ITS models, the staggered entry time must be: interrupt 1 < stagger time < interrupt 2. For example, a WHO health policy may have started in the 3rd year of the study period in NY and Toronto but Chicago and LA joined 6 and 12 months later, therefore stagger= list(a= 'city', b=c('Chicago', 'LA')), c=(30, 36) while interrupt= 25. Default is NULL.

topcode

optional value selected to top code Y (or left-hand side) of the formula. Analyses will be performed using the new top coded variable.

propensity

optional character vector of variable names or formula class to perform a propensity score model. This requires the 'intervention' option to be selected only when using a character vector. All models will include 'pscore' (propensity score) in the analysis as a covariate adjustment using the propensity score unless one of the weights options 'ipw' (Inverse Probability Weights for the ATE, Average Treatment Effect, estimator), 'nipw' (Normalized Inverse Probability Weights by Hajek estimator), or 'att' (Average Treatment Effect on the Treated) is selected for the analysis.

trim

an optional two-element numeric vector that sets limits between 0 and 1 for the propensity score. If NULL, the default values of >= 0 and <= 1 are used (i.e., c(0,1)).

weights

an optional 1-element character vector of the data frame column name or character vector of either 'ipw', 'nipw', or 'att' for Inverse Probability of Treatment Weighting Using the propensity score (see 'propensity' above) to be used in the fitting process. Should be NULL or a character vector. If non-NULL, weighting is used with weights; otherwise standard regression is used.

offset

this can be used to specify an a priori known component to be included in the linear predictor during fitting. This should be NULL or a numeric vector of length equal to the number of cases. One or more offset terms can be included in the formula instead or as well, and if more than one is specified their sum is used.

newdata

optional logical value that indicates if you want the new data returned. newdata=TRUE will return the data with any new columns created from the DID, ITS, propensity score, or top coding. The default is newdata=FALSE. No new data will be returned if none was created.

family

a description of the error distribution and link function to be used in the model. For glm this can be a character string naming a family function, a family function or the result of a call to a family function. For glm.fit only the third option is supported. See 'glm' function in Base R for more details. Currently available when regression= 'logistic' or 'Poisson'.

Value

a list of results from selected regression models. Will return new data if selected. And returns relevant model information such as variable names, type of analysis, formula, study information, and summary of ITS effects if analyzed.

References

Angrist, J. D., & Pischke, J. S. (2009). Mostly Harmless Econometrics: An Empiricist's Companion. Princeton University Press. ISBN: 9780691120355.

Gebski, V., et al. (2012). Modelling Interrupted Time Series to Evaluate Prevention and Control of Infection in Healthcare. Epidemiology & Infections, 140, 2131–2141. https://doi.org/10.1017/S0950268812000179

Linden, A. (2015). Conducting Interrupted Time-series Analysis for Single- and Multiple-group Comparisons. The Stata Journal, 15, 2, 480-500. https://doi.org/10.1177/1536867X1501500208

Examples

# ordinary least squares R^2
summary(assess(hp ~ mpg+wt, data=mtcars, regression="ols")$model)

# logistic
summary(assess(formula=vs~mpg+wt+hp, data=mtcars, regression="logistic")$model)

# OLS with a propensity score
summary(assess(formula=los ~ month+program, data=hosprog, intervention = "program",
regression="ols", propensity=c("female","age","risk"))$model)

# OLS: top coding los at 8.27 and propensity score means (top.los and pscore)
summary(assess(formula=los ~ month+program, data=hosprog, intervention = "program",
regression="ols", topcode=8.27, propensity=c("female","age","risk"),
newdata=TRUE)$newdata[, c("los", "top.los", "pscore")])

# differences-in-differences model: using 2 time periods, pre- and post-intervention
summary(assess(formula=los ~ ., data=hosprog, intervention = "program",
int.time="month", treatment = 5, did="two")$DID)

# DID model: using time points
summary(assess(formula=los ~ ., data=hosprog, intervention = "program",
int.time="month", treatment = 5, did="many")$DID)

#interrupted time series model: two groups and 1 interruption (interrupt= 5)
summary(assess(formula=los ~ ., data=hosprog, intervention = "program",
int.time="month", its="two", interrupt = 5)$ITS)

#interrupted time series model: two groups and 2 interruptions (interrupt= c(5,9))
summary(assess(formula=los ~ ., data=hosprog, intervention = "program",
int.time="month", its="two", interrupt = c(5,9))$ITS)

# ordinary least squares R^2
summary(assess(hp ~ mpg+wt, data=mtcars, regression="ols")$model)

# logistic
summary(assess(formula=vs~mpg+wt+hp, data=mtcars, regression="logistic")$model)

# OLS with a propensity score
summary(assess(formula=los ~ month+program, data=hosprog, intervention = "program",
regression="ols", propensity=c("female","age","risk"))$model)

# OLS: top coding los at 8.27 and propensity score means (top.los and pscore)
summary(assess(formula=los ~ month+program, data=hosprog, intervention = "program",
regression="ols", topcode=8.27, propensity=c("female","age","risk"),
newdata=TRUE)$newdata[, c("los", "top.los", "pscore")])

# differences-in-differences model: using 2 time periods, pre- and post-intervention
summary(assess(formula=los ~ ., data=hosprog, intervention = "program",
int.time="month", treatment = 5, did="two")$DID)

# DID model: using time points
summary(assess(formula=los ~ ., data=hosprog, intervention = "program",
int.time="month", treatment = 5, did="many")$DID)

#interrupted time series model: two groups and 1 interruption (interrupt= 5)
summary(assess(formula=los ~ ., data=hosprog, intervention = "program",
int.time="month", its="two", interrupt = 5)$ITS)

#interrupted time series model: two groups and 2 interruptions (interrupt= c(5,9))
summary(assess(formula=los ~ ., data=hosprog, intervention = "program",
int.time="month", its="two", interrupt = c(5,9))$ITS)

Summarize Bayesian Markov Chain Monte Carlo (MCMC) object

Description

Convert a list of Bayesian analysis chains (e.g., coda package mcmc.list objects) into a data frame for analysis and creating plots. For example, models from JAGS or Stan that were converted into coda class objects can be used to create the data frames. Calculates a set of descriptive statistics that summarize MCMC parameters. And values can be calculated for use in descriptive graphs such as values associated with specific percentiles and vice-versa to help set targets, summaries on hierarchical or multilevel models (up to 3 levels), and the R2 for Bayesian regression models with metric level predictors.

Usage

Bayes(
  x,
  y = "mcmc",
  parameter = NULL,
  mass = 0.95,
  compare = NULL,
  rope = NULL,
  newdata = FALSE,
  type = NULL,
  center = "mode",
  data = NULL,
  dv = NULL,
  iv = NULL,
  expand = NULL,
  targets = NULL
)
Bayes(
  x,
  y = "mcmc",
  parameter = NULL,
  mass = 0.95,
  compare = NULL,
  rope = NULL,
  newdata = FALSE,
  type = NULL,
  center = "mode",
  data = NULL,
  dv = NULL,
  iv = NULL,
  expand = NULL,
  targets = NULL
)

Arguments

x

list object of multiple MCMC chains (e.g., matrix class list elements or coda mcmc.list).

y

character vector for the type of analysis or output to perform. Select 'post', 'multi', 'target', 'r2', or 'mcmc' for a posterior summary, multilevel/hierarchical model summary (up to 3 levels), target summary, Gelman R-squared statistic, or list object of MCMC chains converted into a data frame. Default is generic 'mcmc'(no analysis, just MCMC creation).

parameter

single or multiple element character vector name of parameter(s) in MCMC chains to produce summary statistics. When y='target', use the generally 2 to 3 parameters that represent the distribution parameters (e.g., parameter= c('mean', 'sd')). When y='r2', use the regression parameters in order, ending with the residual or level-1 variance (e.g., parameter= c('intercept', 'beta1', 'beta2', 'standard_deviation')). Default is NULL.

mass

numeric vector that specifies the credible mass used in the Highest Density Interval (HDI). Default is 0.95.

compare

numeric vector with one comparison value to determine how much of the distribution is above or below the comparison value. Default is NULL.

rope

numeric vector with two values that define the Region of Practical Equivalence (ROPE). Test hypotheses by setting low and high values to determine if the Highest Density Interval (HDI) is within or outside of the ROPE. Parameter value declared not credible if the entire ROPE lies outside the HDI of the parameter’s posterior (i.e., we reject the null hypothesis). For example, the ROPE of a coin is set to 0.45 to 0.55 but the posterior 95% HDI is 0.61 - 0.69 so we reject the null hypothesis value of 0.50. We can accept the null hypothesis if the entire 95% HDI falls with the ROPE. Default is NULL.

newdata

optional logical vector that indicates if you want the new MCMC data returned. When newdata=TRUE, it will return the list object of MCMC chains, converted into a data frame. This data is used for analysis and all plots. Please select newdata=TRUE to produce any graphs but not needed when y='multi'. The default is newdata=FALSE.

type

character vector of length == 1 that indicates the likelihood function used in the model when y='multi' or y='target'. Select 'n', 'ln', 'w', 'g', 't', 'bern', and 'bin' for these respective options in Bayesian estimation (multilevel): 'Normal', 'Log-normal', 'Weibull', 'Gamma', 't', 'Bernoulli', or 'binomial'. Default is NULL.

center

character vector that selects the type of central tendency to use when reporting parameter values when y='post', y='target', or y='r2'. Choices include: 'mean', 'median', and 'mode' when y='post', or 'mean' and 'median' when y='r2'. Default is 'mode' when y='post' or 'target' and 'median' when y='r2'.

data

object name for the observed data when y='multi' or y='r2'. Default is NULL.

dv

character vector of length == 1 for the dependent variable name in the observed data frame when y='multi'. Default is NULL.

iv

character vector of length >= 1 for the independent variable name(s) in the observed data frame when y='multi' or y='r2'. When y='multi', enter the lower to higher level clustering or group names (e.g, for health data, iv=c("patient", "hospital"). When type='taov', enter the name of the test group variable. When y='r2', enter the observed data variable names for the hierarchical or multilevel groups. Default is NULL.

expand

a character vector of length == 1 indicating the variable name to expand aggregated data into non-aggregated data frames when y='multi'. This variable is the denominator that can be used to calculate a rate in the formula numerator/denominator. For example, when the 'numerator' column equals 4 and the 'denominator' column equals 10, then this single row of data is expanded to 10 rows with four values of 1 and six values of 0 when expand='denominator'. Default is NULL.

targets

list of one or two named elements (p, y) with numeric values that represent quantile values (p) in the distribution to return associated outcome values and/or specific outcome values (y) to retrieve associated probabilities. For example, a distribution of harmful hospital readmission rates has an estimated median value of 0.25. Staff are considering 2 types of targets, percentiles (p) of key interest and specific outcome rates (y). They want to know the readmission rate that is at the 40th percentile for a reduced readmission rate (below what is 'average' at the 50th percentile) and the probability greater than a readmission rate of 0.20. They get this information by entering targets=list(p=0.40, y=0.20); calculating 1 - prob(y) from returned results gives them an idea about the effort needed to meet this target of a reduced readmission rate. Select type= one of these options: 'n', 'ln', 'w', 'g', 't', 'bern', 'bin'. Also select parameter= the appropriate center, spread, and possible 3rd shape distribution parameter (e.g., parameter=c('mean', 'sd')). And option to select center= 'mean', 'median', 'mode'. Default is NULL.

Value

data frame of summary statistics for the MCMC parameter's distribution and/or MCMC data frame. Statistics include highest density interval, effective sample size, proportion of distribution within and outside of a ROPE, distribution compared with a set value, and the parameter's mean, median, and mode. And distribution summaries for multilevel models, target summaries, and regression model R2.

References

Cohen, J. (1988). Statistical Power Analysis for the Behavioral Sciences, Second Edition. Hillsdale, NJ: Lawrence Erlbaum Associates, Publishers. ISBN 0-8058-0283-5

Gelman, A., Goodrich, B., Gabry, J., & Vehtari, A. (2019). R-squared for Bayesian Regression Models. The American Statistician, 73, 3, 307–309. https://doi.org/10.1080/00031305.2018.1549100

Kruschke, J. (2014). Doing Bayesian Data Analysis: A Tutorial with R, JAGS, and Stan, Second Edition. New York: Academic Press. ISBN: 9780124058880

Examples

# Posterior estimates of length of stay (LOS) #
# What is the 'average' LOS and are we beyond our goal of it being <= 3 days?
blos1 <- Bayes(losmcmc, y="post", parameter="muOfY", newdata=TRUE,
compare=4.5, rope=c(1,3))   #Where are we in relation to 4.5 days?
print(blos1$Posterior.Summary) #looks like LOS is substantially higher than 3 days

# Multilevel or hierarchical model summaries #
# Code below does not run, no 'mcmc_sample' object
# bmulti0 <- Bayes(x=mcmc_sample, parameter=c("theta", "omega","omegaO"),
# y="multi", type="bern", data=mydf, dv="upbin", iv= c("Plant", "Group"))

# Targets for length of stay (LOS) #
# Our administrators ask how far are we from our goals, they ask about targets
# in increments of 5 points of probability or specific days. We answer both.
btarget1 <- Bayes(x=losmcmc, y="target", type="n", parameter=c("muOfY","sigmaOfY"),
newdata=TRUE, target=list(p=c(.35,.4,.45, .5, .55),  y=c(3,4))) # 'newdata' for plots
print(btarget1$Target$Est.Quantile.P)  # 5-point increments
print(btarget1$Target$Est.Prob.GT.Y)   # specific day values, 4 days is plausible

# Gelamn's R^2 #
# the regression model using Base R data, CO2: update ~ conc
bR2 <- Bayes(x=co2mcmc, y='r2', data=CO2, iv="uptake",
parameter=c("b0", "b1", "sigma"), newdata=TRUE)
# bR2 returns various information
print(bR2$R2.Summary$R2)                   # R^2
print(bR2$R2.Summary$Variance.Pred.Y)      # Variance of predicted outcome
print(bR2$R2.Summary$Variance.Residuals)   # Variance of residuals
print(head(bR2$R2.Summary$yPRED))          # Predicted outcomes
# Posterior estimates of length of stay (LOS) #
# What is the 'average' LOS and are we beyond our goal of it being <= 3 days?
blos1 <- Bayes(losmcmc, y="post", parameter="muOfY", newdata=TRUE,
compare=4.5, rope=c(1,3))   #Where are we in relation to 4.5 days?
print(blos1$Posterior.Summary) #looks like LOS is substantially higher than 3 days

# Multilevel or hierarchical model summaries #
# Code below does not run, no 'mcmc_sample' object
# bmulti0 <- Bayes(x=mcmc_sample, parameter=c("theta", "omega","omegaO"),
# y="multi", type="bern", data=mydf, dv="upbin", iv= c("Plant", "Group"))

# Targets for length of stay (LOS) #
# Our administrators ask how far are we from our goals, they ask about targets
# in increments of 5 points of probability or specific days. We answer both.
btarget1 <- Bayes(x=losmcmc, y="target", type="n", parameter=c("muOfY","sigmaOfY"),
newdata=TRUE, target=list(p=c(.35,.4,.45, .5, .55),  y=c(3,4))) # 'newdata' for plots
print(btarget1$Target$Est.Quantile.P)  # 5-point increments
print(btarget1$Target$Est.Prob.GT.Y)   # specific day values, 4 days is plausible

# Gelamn's R^2 #
# the regression model using Base R data, CO2: update ~ conc
bR2 <- Bayes(x=co2mcmc, y='r2', data=CO2, iv="uptake",
parameter=c("b0", "b1", "sigma"), newdata=TRUE)
# bR2 returns various information
print(bR2$R2.Summary$R2)                   # R^2
print(bR2$R2.Summary$Variance.Pred.Y)      # Variance of predicted outcome
print(bR2$R2.Summary$Variance.Residuals)   # Variance of residuals
print(head(bR2$R2.Summary$yPRED))          # Predicted outcomes

Patient survey data

Description

Artificial data of a 5 item hospital satisfaction survey for a Cronbach's alpha scale (cas).

Usage

cas
cas

Format

`cas`

An artificial data frame with 100 rows and 5 columns:

i1 - i5: 5 survey items

...

Source

Artificial dataset created with rbinom for 5 items. For example, rbinom(100, 5, .9) generates 1 item. The prob argument is modified to give more or less consistent ratings per item.

Markov Chain Monte Carlo linear regression estimates of plant's CO2 uptake regressed on ambient carbon dioxide concentrations

Description

Markov Chain Monte Carlo linear regression estimates of plant's CO2 uptake regressed on ambient carbon dioxide concentrations

Usage

co2mcmc
co2mcmc

Format

`co2mcmc`

MCMC list of 3 chains with 2000 rows and 4 columns in each of the list elements:

b0: Regression intercept.
b1: Regression coefficient of the carbon dioxide concentration effect.
sigma: Square root of model residual variance.
lp__: Total log unnormalized joint posterior density of the model parameters.

...

Source

co2mcmc is a list of 3 MCMC simulations. It estimates a linear regression using the CO2 data frame from base R. A regression model was specified using a normal distribution maximum likelihood and normal and uniform distribution priors for the beta coefficients and sigma parameters.

Bayes class object with summarized Markov Chain Monte Carlo estimates of plant's CO2 uptake as a binary variable that is above or below the median level

Description

Bayes class object with summarized Markov Chain Monte Carlo estimates of plant's CO2 uptake as a binary variable that is above or below the median level

Usage

co2multi
co2multi

Format

`co2multi`

Bayes class list of 7 elements with varying rows and columns in each of the list elements:

Posterior.Summary: Standard element that is NA in this object because it is not applicable to this analysis.
MCMC: Dataframe of 4 combined MCMC chain simulations with 8000 rows and 18 columns. Column 1 is CHAIN indicates the 4 chains, columns 2-13 are 'theta' and indicates each plant's rate of having above the median uptake level, columns 14-17 are 'omega' group rates of 'Type' by 'Treatment', and column 18 is the overall group rate.
Multilevel: Posterior estimate summaries in alphabetical order of plant or group names and numerical order plant or group rates.
Target: Standard element that is NA in this object because it is not applicable to this analysis.
targets: Standard element that is NA in this object because it is not applicable to this analysis.
R2.Summary: Standard element that is NA in this object because it is not applicable to this analysis.
parameter: 3 parameter names in the multilevel analysis, 'theta', 'omega', and 'omegaO'.

...

Source

co2multi is a list of summarized Markov Chain Monte Carlo estimates of plant's CO2 uptake as a binary variable that is above or below the median level. It summarizes measurements nested within plants nested within the 4 levels from a combination of 'Treatment' by 'Type'. An estimation used the CO2 data frame from base R. A hierarchical estimation model was specified using a gamma distribution maximum likelihood and normal and uniform distribution priors at various group levels.

Statistics for Shewhart control charts

Description

Calculate statistics that can be used to produce X-bar charts, p-charts, and u-charts. This includes producing means for center lines, 3-sigma upper and lower control limits. Users can also calculate values before and after an intervention to see if a change in the control process happened. Values are returned in a data frame.

Usage

control(
  x,
  y = NULL,
  time,
  data,
  type = "x",
  subset = NULL,
  n.equal = TRUE,
  intervention = NULL
)
control(
  x,
  y = NULL,
  time,
  data,
  type = "x",
  subset = NULL,
  n.equal = TRUE,
  intervention = NULL
)

Arguments

x

character outcome variable in X bar charts or numerator variable name for u-charts or p-charts (when p-chart data is aggregated).

y

character variable name for u-charts or p-charts (when p-chart data is aggregated). When y is present, it becomes the denominator for a rate calculated as x/y. Default is NULL.

time

time variable name.

data

name of data frame object.

type

indicate what type of control chart is needed. Options for the X-bar, p-, and u-charts should be listed as 'x', 'p', and 'u'. Default is the 'x' chart.

subset

an expression defining a subset of the observations to use in the grouping. The default is NULL, thereby using all observations. Specify, for example, data$hospital == "NY" or c(1:100,200:300) respectively to use just those observations.

n.equal

whether there are or we assume equal subgroup (sample) sizes. If n.equal=TRUE, control limits are calculated using the overall mean n value. If n.equal=FALSE, control limits are based on each subgroup's sample size. Default is TRUE.

intervention

a single numeric value for the time when an intervention begins (e.g., intervention=25; intervention begins on the 25th day). Separate means and control limits are calculated pre- and post-intervention. Default is NULL.

Value

data frame of control chart statistics for X-bar charts, p-charts, and u-charts. Includes means, standard deviations, and 3-sigma upper and lower control limit values.

References

Ryan, T. (2011). Statistical Methods for Quality Improvement, Third Edition. New Jersey: John Wiley & Sons, Inc. ISBN: 978-0-470-59074-4.

Examples

## Hospital LOS and readmissions ##
# X-bar chart statistics
spc_x <- control(x="los", time="month", data=hosprog, type="x", n.equal=TRUE)
print(spc_x) # get data frame output

# X-bar chart statistics not assuming equal sample sizes, subsetting for females
spc_x <- control(x="los", time="month", data=hosprog, type="x", n.equal=FALSE,
  subset=hosprog$female==1)
print(spc_x) # get data frame output

# p-chart statistics, using only the numerator (i.e., y=NULL). Specify unequal sample sizes
spc_p <- control(x="rdm30", time="month", data=hosprog, type="p", n.equal=FALSE)
print(spc_p) # get data frame output

# u-chart for infection rates with an intervention at the 22nd month
spc_u <- control(x="HAI", y="PatientDays", time="Month", data=infections,
type="u", n.equal=FALSE, intervention=22)
print(spc_u) # get data frame output
## Hospital LOS and readmissions ##
# X-bar chart statistics
spc_x <- control(x="los", time="month", data=hosprog, type="x", n.equal=TRUE)
print(spc_x) # get data frame output

# X-bar chart statistics not assuming equal sample sizes, subsetting for females
spc_x <- control(x="los", time="month", data=hosprog, type="x", n.equal=FALSE,
  subset=hosprog$female==1)
print(spc_x) # get data frame output

# p-chart statistics, using only the numerator (i.e., y=NULL). Specify unequal sample sizes
spc_p <- control(x="rdm30", time="month", data=hosprog, type="p", n.equal=FALSE)
print(spc_p) # get data frame output

# u-chart for infection rates with an intervention at the 22nd month
spc_u <- control(x="HAI", y="PatientDays", time="Month", data=infections,
type="u", n.equal=FALSE, intervention=22)
print(spc_u) # get data frame output

Statistics for Decision Curve Analysis and logistic regression model classification

Description

Calculate statistics such as sensitivity, specificity, positive and negative predictive values, and net benefit and interventions saved from a decision curve analysis.

Usage

decide(x, threshold)
decide(x, threshold)

Arguments

x

assess regression object, currently logistic regression.

threshold

numeric vector of length == 1 that sets the threshold used to calculate sensitivity, specificity, positive and negative predictive values, and net benefit and interventions saved from a decision curve analysis. Select thresholds on appropriate linear scale such as logits for logistic regression models.

Value

summary of model classification based on the selected threshold/cutoff, area under the curve (AUC)/c-statistic, predicted outcomes (transformed if applicable), decision curve analysis values at various percentiles, and sensitivity and specificity related statistics for the regression model at the specified threshold.

References

Vickers, A. & Elkin, E. (2006). Decision Curve Analysis: A Novel Method for Evaluating Prediction Models. Society for Medical Decision Making, 26, 6, 565-574. https://doi.org/10.1177/0272989X06295361

Examples

## Predicting car engine shape type, v or straight  ##
# run the model
car_m1 <- assess(formula=vs ~ hp + am, data=mtcars, regression="logistic")
# create a decide object, enter the model name and a threshold on the logit scale
d1 <- decide(x=car_m1, threshold= -0.767)
# View model classification related statistics
print(d1$Model.Summary$Classification)

# View decision curve analysis results like 'net benefit' at various thresholds
print(d1$DCA)
## Predicting car engine shape type, v or straight  ##
# run the model
car_m1 <- assess(formula=vs ~ hp + am, data=mtcars, regression="logistic")
# create a decide object, enter the model name and a threshold on the logit scale
d1 <- decide(x=car_m1, threshold= -0.767)
# View model classification related statistics
print(d1$Model.Summary$Classification)

# View decision curve analysis results like 'net benefit' at various thresholds
print(d1$DCA)

Group level confidence intervals and between-group variation

Description

Group level confidence intervals and between-group variation

Usage

group(
  x,
  y,
  z = NULL,
  data,
  dist = "t",
  conf.int = 0.95,
  increment = 1,
  rolling = NULL,
  quarts = FALSE,
  cluster = FALSE,
  subset = NULL,
  asis = FALSE,
  dataf = NULL
)
group(
  x,
  y,
  z = NULL,
  data,
  dist = "t",
  conf.int = 0.95,
  increment = 1,
  rolling = NULL,
  quarts = FALSE,
  cluster = FALSE,
  subset = NULL,
  asis = FALSE,
  dataf = NULL
)

Arguments

x

group predictor variable name.

y

outcome variable name.

z

time period variable name.

data

name of data frame object.

dist

indicate the distribution used for confidence intervals. Options for the t, binomial, and exact Poisson distributions. Options are 't', 'b', and 'p'. Default is the 't'.

conf.int

select the confidence interval level. Default is 0.95.

increment

specify the increment in time periods. Selecting 3 if data uses the month as the unit of time will give confidence intervals, each based on 3 months. Default is 1.

rolling

indicate the number of time periods for the 'rolling average'. The rolling average consists of >1 time periods but subsequent point estimate increase by a unit of 1. For example, the common 12-month rolling average is based on months 1-12 of data, followed by the next estimate using months 2-13, 3-14, and so on until the last month in the data has been reached. Default is NULL.

quarts

logical TRUE or FALSE that indicates whether to convert continuous x into 4 groups based on quartiles of x. Default is FALSE.

cluster

logical TRUE or FALSE to generate measures of between-group variation such as the Intra-Class Correlation, Median Odds Ratio, or Design Effect. Default is FALSE. Uses binary outcome formula (between-group variance/(between-group variance + (3.14^2/3)) for ICC in Rabe-Hesketh which may be more appropriate for multilevel models. ICC, MOR, DE may be less reliable for binomial and Poisson distributions, use caution.

subset

asis

a logical vector that indicates if data will be processed as having only 1 unique observation per 'x' and 'z' combination (i.e., this is intended for use with aggregated data). Default is FALSE. This will allow the plot function to graph single observation data for groups over time. Only the t distribution is used for the overall trend line and confidence band (works in conjunction with 'ocol' and 'oband').

dataf

not currently used, please use 'data' instead.

Value

list of confidence intervals for outcomes by groups, over time, and clustering measures. Some values returned in alphabetical and numerical order based on the group.

References

Merlo, J. (2006). A brief conceptual tutorial of multilevel analysis in social epidemiology: using measures of clustering in multilevel logistic regression to investigate contextual phenomena. Journal of Epidemiological Health, 60, 4, 290-297. https://doi.org/10.1136/jech.2004.029454.

Muthen, B. & Satorra, A. (1995). Complex Sample Data in Structural Equation Modeling. Sociological Methodology, 25, 267-316. https://doi.org/10.2307/271070.

Rabe-Hesketh, S. & Skrondal, A. (2008). Multilevel and Longitudinal Modeling Using Stata, Second Edition. ISBN: 978-1-59718-040-5.

Examples

#default t distribution results
group(x="program", y="los", data=hosprog)
#Rounding LOS to integers
hp2 <- hosprog; hp2$los2 <- round(hp2$los, 0)
#Exact Poisson confidence intervals
group(x="program", y="los2", data=hp2, dist="p")
#Rolling 6-months of data
group(x="program", y="los", z="month", data=hosprog, dist="t", rolling=6)
#Data returned separately for rolling 6-months of data and 3-month increments (e.g., quarters)
group(x="program", y="los", z="month", data=hosprog, dist="t", increment=3, rolling=6)
#Quartile groups for continuous risk score and returned clustering info
group(x="risk", y="los", data=hosprog, quarts=TRUE, cluster=TRUE)
#Binomial distribution with less conservative 90% confidence intervals
group(x="risk", y="rdm30", data=hosprog, quarts=TRUE, dist="b", conf.int=0.90)
#default t distribution results
group(x="program", y="los", data=hosprog)
#Rounding LOS to integers
hp2 <- hosprog; hp2$los2 <- round(hp2$los, 0)
#Exact Poisson confidence intervals
group(x="program", y="los2", data=hp2, dist="p")
#Rolling 6-months of data
group(x="program", y="los", z="month", data=hosprog, dist="t", rolling=6)
#Data returned separately for rolling 6-months of data and 3-month increments (e.g., quarters)
group(x="program", y="los", z="month", data=hosprog, dist="t", increment=3, rolling=6)
#Quartile groups for continuous risk score and returned clustering info
group(x="risk", y="los", data=hosprog, quarts=TRUE, cluster=TRUE)
#Binomial distribution with less conservative 90% confidence intervals
group(x="risk", y="rdm30", data=hosprog, quarts=TRUE, dist="b", conf.int=0.90)

Patient hospital program/intervention data, intervention group only

Description

Patient hospital program/intervention data, intervention group only

Usage

hosp1
hosp1

Format

`hosp1`

An artificial data frame with 352 rows and 10 columns, intervention patients only:

survey: Patient satisfaction survey mean score.
los: Hospital length of stay (los)
cost: Hospital stay cost
rdm30: Patient readmission within 30 days of discharge
death30: Patient death within 30 days of discharge
female: Patient sex, 1 indicates female, 0 otherwise
age: Patient age
risk: Patient health risk score ranging from 0 to 1
month: 12 month indicator (1 to 12)
program: Indicates patient program participation. 1='yes', 0='no'

...

Source

hosp1 is a subset of the artificial dataset hosprog. It is the intervention group's data used for single group interrupted time series.

Patient hospital program/intervention data

Description

Patient hospital program/intervention data

Usage

hosprog
hosprog

Format

`hosprog`

An artificial data frame with 720 rows and 10 columns:

survey: Patient satisfaction survey mean score.
los: Hospital length of stay (los)
cost: Hospital stay cost
rdm30: Patient readmission within 30 days of discharge
death30: Patient death within 30 days of discharge
female: Patient sex, 1 indicates female, 0 otherwise
age: Patient age
risk: Patient health risk score ranging from 0 to 1
month: 12 month indicator (1 to 12)
program: Indicates patient program participation. 1='yes', 0='no'

...

Source

Artificial dataset created by using runif. The strength in the association between each variable is weighted by multiplying each subsequent predictor in increments of 1. For example, Y equals runif(720) multiplied by 1 plus runif(720) multiplied by 2 and so on. This allows some predictors to have stronger correlations with Y.

Importance of variables based on partial chi-square statistic

Description

Calculates partial chi-square (Wald chi-square for individual coefficients) from assess class objects. The importance is the partial chi-square minus its degrees of freedom based on the regression coefficients (Harrell, 2015). A higher chi-square indicates a larger effect by the predictors. Therefore, the rank of the chi-square can indicate which predictors can contribute more in explaining the variation in the outcome variable.

Usage

importance(model)
importance(model)

Arguments

model

an assess class object or models with lm or glm class.

Value

a data.frame object with partial X^2 summary statistics.

References

Harrell, F. E., Jr. (2016). Regression Modeling Strategies. Springer International Publishing. ISBN: 978-3-319-19424-0.

Examples

# OLS regression
importance(assess(mpg ~ hp + wt + cyl, data=mtcars, regression= "ols")$model)

# logistic regression
importance(assess(vs~mpg+wt+hp, data=mtcars, regression= "logistic")$model)

# OLS regression
importance(assess(mpg ~ hp + wt + cyl, data=mtcars, regression= "ols")$model)

# logistic regression
importance(assess(vs~mpg+wt+hp, data=mtcars, regression= "logistic")$model)

Hospital acquired infections (HAI) during 41 months.

Description

Hospital acquired infections (HAI) during 41 months.

Usage

infections
infections

Format

`infections`

An artificial data frame with 41 rows and 3 columns:

Month: Calendar month as an integer, ranging from 1 to 41.
HAI: Count of hospital acquired infections (HAI) within 1 month.
PatientDays: Count of hospital patient days within 1 month (i.e., the sum of days in the hospital for all patients).

...

Source

infections is an artificial data frame of reasonable hospital acquired infections (HAI) estimates made entirely for the purpose of demonstrating control charts.

Interpret model output

Description

Provides simple interpretations of regression coefficients and Cronbach's alpha from assess and alpha function classes. The interpretations describe coefficients and significance values as well as modifying item scales. The interpretations are text comments associated with specific parameters of the various analyses.

Usage

interpret(object)
interpret(object)

Arguments

object

alpha and assess class objects: alpha, ITS, DID, linear (ols) or logistic models.

Value

a list with interpretations of Cronbach's alpha scales or regression model results.

Examples

# Interpret Cronbach's alpha
interpret(alpha(items=c("i1","i2","i3","i4","i5"), data=cas))

# interpret a standard linear (OLS) regression
hos1 <- assess(formula=survey ~ program + month, data=hosprog, regression= "ols")
interpret(hos1)$model

# interpret a differences-in-differences model
hos2 <- assess(formula=survey ~ ., data=hosprog, intervention = "program",
int.time="month", treatment = 5, did="two", newdata=TRUE)
interpret(hos2)$did  #interpret(hos2) also runs, returns ITS results if present

# interpret an interrupted time series model
hos3 <- assess(formula=survey ~ ., data=hosprog, intervention = "program",
int.time="month", its="two", interrupt = 5)
interpret(hos3)$its
# Interpret Cronbach's alpha
interpret(alpha(items=c("i1","i2","i3","i4","i5"), data=cas))

# interpret a standard linear (OLS) regression
hos1 <- assess(formula=survey ~ program + month, data=hosprog, regression= "ols")
interpret(hos1)$model

# interpret a differences-in-differences model
hos2 <- assess(formula=survey ~ ., data=hosprog, intervention = "program",
int.time="month", treatment = 5, did="two", newdata=TRUE)
interpret(hos2)$did  #interpret(hos2) also runs, returns ITS results if present

# interpret an interrupted time series model
hos3 <- assess(formula=survey ~ ., data=hosprog, intervention = "program",
int.time="month", its="two", interrupt = 5)
interpret(hos3)$its

Interrupted time series analysis effects

Description

Calculates effects for intervention and control groups based on interrupted time series models from an assess class object. Within a period (or interruption), the effect that represents the trend during the period is calculated for both groups as well as the difference between the groups. Summary statistics are provided that include the effect sizes, t-statistic, standard errors, p-values, and 95% confidence intervals of the effect sizes. These values are provided for the intervention group, control group (when applicable), and the differences between the two groups (Linden, 2015). These values are automatically generated while running a model in assess.

Usage

itsEffect(model, type, interruptions)
itsEffect(model, type, interruptions)

Arguments

model

an interrupted time series (ITS) model with the "lm" class,

type

analysis type for single or multiple groups and single or multiple time periods. If selected type="sgst", it is single-group single-time; type="sgmt", it is single-group multiple-time; type="mgst", it is multiple-group single-time; and type="mgmt", it is multiple-group multiple-time.

interruptions

ITS number of interruptions as numeric vector greater than 0.

Value

a data.frame object of ITS effects and summary statistics. Row names are indicated with first, second, and additional numbers when there are multiple interruptions (e.g., 'Intervention.2'). Generally run within the assess function.

References

Linden, Ariel. (2015). Conducting Interrupted Time-series Analysis for Single- and Multiple-group Comparisons. The Stata Journal, 2015, 15(2), 480-500, https://doi.org/10.1177/1536867X1501500208

Examples

i21 <- assess(formula=survey ~ ., data=hosprog, intervention = "program",topcode =NULL,
int.time="month", regression="none", interrupt=5, its="two", newdata=TRUE, propensity=NULL)
itsEffect(model= i21$ITS, type= "mgst", interruptions= 1)

i21 <- assess(formula=survey ~ ., data=hosprog, intervention = "program",topcode =NULL,
int.time="month", regression="none", interrupt=5, its="two", newdata=TRUE, propensity=NULL)
itsEffect(model= i21$ITS, type= "mgst", interruptions= 1)

Markov Chain Monte Carlo estimates of hospital length of stay from the hosprog data frame

Description

Markov Chain Monte Carlo estimates of hospital length of stay from the hosprog data frame

Usage

losmcmc
losmcmc

Format

`losmcmc`

MCMC list of 4 chains with 5000 rows and 2 columns in each of the list elements:

muOfY: Estimate of the mean parameter.
sigmaOfY: Estimate of the square root of variance parameter.

...

Source

losmcmc is a list of 4 MCMC simulations. It estimates the mean and standard deviation using the artificial hosprog data frame from ham. An estimation was specified using a normal distribution maximum likelihood and normal and uniform distribution priors for the mean and sigma parameters.

Prediction plot of treatment and control groups for DID and ITS models

Description

Provides partial prediction plots for treatment and control groups from difference-in-difference (DID) and interrupted time series (ITS) models. The graph will produce lines for treatment/intervention and control groups to gain understanding through a visual representation of the regression coefficients. By default, the treatment/intervention group is represented with a blue line, the control group is represented with a red line, and the counterfactual line, when available, is a dashed line. There are many options to change the plot.

Usage

## S3 method for class 'assess'
plot(
  x,
  y,
  xlim = NULL,
  ylim = NULL,
  xlab = NULL,
  ylab = NULL,
  main = NULL,
  lwd = NULL,
  col = NULL,
  tcol = NULL,
  cfact = FALSE,
  conf.int = FALSE,
  adj.alpha = NULL,
  add.means = FALSE,
  add.legend = NULL,
  legend = NULL,
  tgt = NULL,
  tgtcol = "gray",
  cex = 1,
  cex.axis = NULL,
  cex.lab = NULL,
  cex.main = NULL,
  cex.text = NULL,
  cex.legend = NULL,
  name = FALSE,
  coefs = FALSE,
  round.c = NULL,
  pos.text = NULL,
  arrow = FALSE,
  xshift = NULL,
  x.axis = NULL,
  ...
)
## S3 method for class 'assess'
plot(
  x,
  y,
  xlim = NULL,
  ylim = NULL,
  xlab = NULL,
  ylab = NULL,
  main = NULL,
  lwd = NULL,
  col = NULL,
  tcol = NULL,
  cfact = FALSE,
  conf.int = FALSE,
  adj.alpha = NULL,
  add.means = FALSE,
  add.legend = NULL,
  legend = NULL,
  tgt = NULL,
  tgtcol = "gray",
  cex = 1,
  cex.axis = NULL,
  cex.lab = NULL,
  cex.main = NULL,
  cex.text = NULL,
  cex.legend = NULL,
  name = FALSE,
  coefs = FALSE,
  round.c = NULL,
  pos.text = NULL,
  arrow = FALSE,
  xshift = NULL,
  x.axis = NULL,
  ...
)

Arguments

x

assess object. Either difference-in-difference or interrupted time series model with no covariate adjustment.

y

type of model, specify either 'DID' (difference-in-difference) or 'ITS' (interrupted time series). Will not accept other models.

xlim

specify plot's x-axis limits with a 2 value vector.

ylim

specify plot's y-axis limits with a 2 value vector.

xlab

a vector label for the x-axis.

ylab

a vector label for the y-axis.

main

the main title of the plot.

lwd

select the line width.

col

specify intervention and control group colors in a vector. Defaults to, if nothing selected, c("blue", "red") or "blue" for single-group Interrupted Time Series models.

tcol

specify treatment or interruption line color as a single character vector. Defaults to "gray" if nothing selected.

cfact

logical TRUE or FALSE that indicates whether a counterfactual line should be included. Defaults to FALSE.

conf.int

logical TRUE or FALSE that indicates whether a 95% confidence interval bands should be included. Defaults to FALSE.

adj.alpha

factor modifying the opacity alpha of the confidence interval bands, in the range of 0 to 1. Default is NULL; if conf.int=TRUE, defaults to 0.4.

add.means

adds group means by time period based on model data. Default is FALSE

add.legend

add a legend by selecting the location as "bottomright", "bottom", "bottomleft", "left", "topleft", "top", "topright", "right", "center". No legend if nothing selected.

legend

a character vector of length >= 1 to appear in the DID and ITS model legends to represent the Intervention and Control groups.

tgt

specify 1 or more values on the x-axis of where to add a target line when y='group'. Or 1 or more values on the y-axis of where to add a target line when y='time' or 'roll'. Default is NULL.

tgtcol

select a color for the target line. Default is 'gray'.

cex

A numerical value giving the amount by which plotting text and symbols should be magnified relative to the default of 1.

cex.axis

The magnification to be used for axis annotation relative to the current setting of cex.

cex.lab

The magnification to be used for x and y labels relative to the current setting of cex.

cex.main

The magnification to be used for main titles relative to the current setting of cex.

cex.text

The magnification to be used for the text added into the plot relative to the current setting of 1.

cex.legend

The magnification to be used for the legend added into the plot relative to the current setting of 1.

name

logical TRUE or FALSE that indicates whether coefficient names should be added to the plot. Default is FALSE. It is overridden if coefs = TRUE.

coefs

logical TRUE or FALSE that indicates whether coefficient names, values, and p-value significance symbols ('+' p<0.10; '' p<0.05; '' p<0.01; '' p<0.001) should be added to the plot. Default is FALSE. coefs = TRUE overrides name = FALSE.

round.c

an integer indicating the number of decimal places to be used for rounding coefficient values. Default is 2.

pos.text

a list of named integer value(s) between 1 to 4 indicating the position of the text added into the plot. List name(s) should use coefficient variable names.

arrow

logical TRUE or FALSE that indicates whether arrows and coefficient names should be added to visualize effects. Default is FALSE.

xshift

shifts one or two of some of the overlapping intervention associated arrows along the x-axis for a better view. Vector values of at least length 1 or 2 can be positive or negative. And xshift should be specified in the order of the coefficients. Only 1 or 2 of the furthest right, vertical lines for the intervention group is shifted (i.e., not left). One line is shifted when there is 1 treatment/interruption period and 2 shifts for 2 periods. (e.g., "DID" before "DID.Trend" for DID models with argument did="many").

x.axis

a vector of unique character or numeric values that makes up x-axis values to replace the intervention time variable values. This will be most helpful if you prefer current calendar years instead of values starting at 1 (e.g., x.axis= sort(unique(data$Year)) for 1900-1999, not 1-100). Must have equal lengths for unique x.axis values and unique replaced values. Default is NULL.

...

additional arguments.

Value

plot of partial predictions for treatment and control groups.

Examples

am2 <- assess(formula= los ~ ., data=hosprog, intervention = "program",
topcode =NULL, int.time="month", regression="none", treatment= 5,
interrupt=c(5,9), did="two", its="two", newdata=TRUE, propensity=NULL)
plot(am2, "DID", add.legend="bottomleft", ylim=c(2, 8))  #DID model, basic plot
plot(am2, "ITS", add.legend="top", ylim=c(2, 8))         #ITS model, basic plot
plot(am2, "DID", add.legend="topleft", main="DID study", col=c("dodgerblue","magenta"),
ylim=c(2, 8), lwd=6, cex=3, cex.axis=2, cex.lab=1.5, cex.main=3, cex.text=2,
arrow=TRUE, xshift=0.02, coefs=TRUE, round.c=2 )
plot(am2, "ITS", add.legend="top", xlim=c(-.5, 13), ylim=c(2, 8), main="ITS study",
col=c("cyan","hotpink"), tcol="springgreen", lwd=7, cex=2, cex.axis=2, cex.lab=2,
cex.main=3, cex.text=1.2, cex.legend=1.25, name=FALSE, coefs=TRUE, round.c=1,
pos.text= list("txp5"=3, "post9"=4), arrow=TRUE, xshift=c(.5, 1.5),
cfact=TRUE, conf.int=TRUE, adj.alpha=0.2)
# ITS: USA's unemployment rate with a single group, 10 interruption model
key_time <- c(5, 14, 17, 29, 42, 59, 69, 73, 80,92)# key interruption periods
u110 <- assess(formula=rate ~ ., data=unemployment, intervention="usa",# model
               int.time="year", its="one", interrupt= key_time)
# Graph the results, note shift in intercept and slope during 1933's New Deal
plot(u110, "ITS", add.means = TRUE, coefs=TRUE, conf.int=TRUE, adj.alpha = .2,
     lwd=2, col="slategray", tcol= "orange", main="US unemployment rate",
     xlab="Years (1929-2024)", ylab="Proportion of labor market",
     cex.main=2, cex.axis = 1.5, cex.lab = 1.5, cex=2, cex.text = 1.25,
     pos.text=list("ITS.Time"=4, "post42"=1,"txp42"=3,"txp92"=3), x.axis=unemployment$Year)
for(i in 1:length(key_time)) {
  text(key_time[i], .22-(.01*i), cex=1.25, labels =
         paste0(unemployment[ key_time[i], "Year"], ": ", unemployment[ key_time[i], "event"]))
}
am2 <- assess(formula= los ~ ., data=hosprog, intervention = "program",
topcode =NULL, int.time="month", regression="none", treatment= 5,
interrupt=c(5,9), did="two", its="two", newdata=TRUE, propensity=NULL)
plot(am2, "DID", add.legend="bottomleft", ylim=c(2, 8))  #DID model, basic plot
plot(am2, "ITS", add.legend="top", ylim=c(2, 8))         #ITS model, basic plot
plot(am2, "DID", add.legend="topleft", main="DID study", col=c("dodgerblue","magenta"),
ylim=c(2, 8), lwd=6, cex=3, cex.axis=2, cex.lab=1.5, cex.main=3, cex.text=2,
arrow=TRUE, xshift=0.02, coefs=TRUE, round.c=2 )
plot(am2, "ITS", add.legend="top", xlim=c(-.5, 13), ylim=c(2, 8), main="ITS study",
col=c("cyan","hotpink"), tcol="springgreen", lwd=7, cex=2, cex.axis=2, cex.lab=2,
cex.main=3, cex.text=1.2, cex.legend=1.25, name=FALSE, coefs=TRUE, round.c=1,
pos.text= list("txp5"=3, "post9"=4), arrow=TRUE, xshift=c(.5, 1.5),
cfact=TRUE, conf.int=TRUE, adj.alpha=0.2)
# ITS: USA's unemployment rate with a single group, 10 interruption model
key_time <- c(5, 14, 17, 29, 42, 59, 69, 73, 80,92)# key interruption periods
u110 <- assess(formula=rate ~ ., data=unemployment, intervention="usa",# model
               int.time="year", its="one", interrupt= key_time)
# Graph the results, note shift in intercept and slope during 1933's New Deal
plot(u110, "ITS", add.means = TRUE, coefs=TRUE, conf.int=TRUE, adj.alpha = .2,
     lwd=2, col="slategray", tcol= "orange", main="US unemployment rate",
     xlab="Years (1929-2024)", ylab="Proportion of labor market",
     cex.main=2, cex.axis = 1.5, cex.lab = 1.5, cex=2, cex.text = 1.25,
     pos.text=list("ITS.Time"=4, "post42"=1,"txp42"=3,"txp92"=3), x.axis=unemployment$Year)
for(i in 1:length(key_time)) {
  text(key_time[i], .22-(.01*i), cex=1.25, labels =
         paste0(unemployment[ key_time[i], "Year"], ": ", unemployment[ key_time[i], "event"]))
}

Bayesian plots for various analyses

Description

Graph Bayesian diagnostic traceplots, density plots on convergence, autocorrelation factor, calculate Monte Carlo Standard Errors, and effective sample sizes. And plot summaries of posterior distributions, posterior predictive checks, summaries on hierarchical or multilevel models (up to 3 levels), and summary graphs of values associated with specific percentiles (and vice-versa) that can be used to help set targets. Plots are developed from Bayes class objects that converted multiple chains into data frames.

Usage

## S3 method for class 'Bayes'
plot(
  x,
  y = NULL,
  type = "n",
  parameter = NULL,
  center = "mode",
  mass = 0.95,
  compare = NULL,
  rope = NULL,
  data = NULL,
  dv = NULL,
  iv = NULL,
  add.data = "n",
  group = NULL,
  main = NULL,
  xlab = NULL,
  ylab = NULL,
  xlim = NULL,
  ylim = NULL,
  vlim = NULL,
  curve = FALSE,
  lwd = NULL,
  breaks = 15,
  bcol = NULL,
  lcol = NULL,
  pcol = NULL,
  xpt = NULL,
  tgt = NULL,
  tgtcol = "gray",
  tpline = NULL,
  tpcol = NULL,
  pline = 20,
  pct = 95,
  add.legend = NULL,
  legend = NULL,
  cex = 1,
  cex.lab = NULL,
  cex.axis = NULL,
  cex.main = NULL,
  cex.text = NULL,
  cex.legend = NULL,
  HDItext = 0.7,
  math = "n",
  es = "n",
  subset = NULL,
  level = NULL,
  aorder = TRUE,
  round.c = 2,
  ...
)
## S3 method for class 'Bayes'
plot(
  x,
  y = NULL,
  type = "n",
  parameter = NULL,
  center = "mode",
  mass = 0.95,
  compare = NULL,
  rope = NULL,
  data = NULL,
  dv = NULL,
  iv = NULL,
  add.data = "n",
  group = NULL,
  main = NULL,
  xlab = NULL,
  ylab = NULL,
  xlim = NULL,
  ylim = NULL,
  vlim = NULL,
  curve = FALSE,
  lwd = NULL,
  breaks = 15,
  bcol = NULL,
  lcol = NULL,
  pcol = NULL,
  xpt = NULL,
  tgt = NULL,
  tgtcol = "gray",
  tpline = NULL,
  tpcol = NULL,
  pline = 20,
  pct = 95,
  add.legend = NULL,
  legend = NULL,
  cex = 1,
  cex.lab = NULL,
  cex.axis = NULL,
  cex.main = NULL,
  cex.text = NULL,
  cex.legend = NULL,
  HDItext = 0.7,
  math = "n",
  es = "n",
  subset = NULL,
  level = NULL,
  aorder = TRUE,
  round.c = 2,
  ...
)

Arguments

x

Bayes class object.

y

character vector for the type of plot to graph. Select 'post', 'dxa', 'dxd', 'dxg', 'dxt', 'check', 'multi' (specialized version of 'check'), or 'target' for posterior summary, diagnostics (4 'dx' plots produced: autocorrelation factor, density plots on chain convergence, Gelman-Rubin statistic, and traceplot), posterior predictive check, multilevel or hierarchical model summary (up to 3 levels), or target summary plots. Default is 'post'.

type

character vector of length == 1 that indicates the likelihood function used in the model when y='check' or y='multi'. Posterior predictive checks allow us to see how well our estimates match the observed data. These checks are available for Bayesian estimation of outcomes and regression trend lines (with polynomial terms) using various distributions in the likelihood function. Select 'n', 'ln', 'sn', 'w', 'g', 't', 'taov', 'taov1', 'ol', 'oq','oc', 'lnl', 'lnq', 'lnc', 'logl', 'logq', 'logc', 'bern', and 'bin' for these respective options in Bayesian estimation (multilevel): 'Normal', 'Log-normal', 'Skew-normal', 'Weibull', 'Gamma', 't', 't: ANOVA, side view', 't: ANOVA 1 group, side view'; and for regression trend lines: 'OLS: Linear', 'OLS: Quadratic', 'OLS: Cubic', 'Log-normal: Linear', 'Log-normal: Quadratic', 'Log-normal: Cubic', 'Logistic: Linear', 'Logistic: Quadratic', and 'Logistic: Cubic', 'Bernoulli', and 'binomial'. The first 8 selections are for Bayesian estimation of outcomes, the next 9 options were developed to assess regression trend lines from ordinary least squares (OLS), log-normal, and logistic models and also for hierarchical model versions. And the remaining two ('bern' and 'bin') are for when y='multi'. Additional models analogous to 'Generalized Linear Models' can also be graphed on the logit scale using 'OLS' options. For example, plot a logistic model on the logit scale when type='ol' (i.e., view straight trend lines). Or if you prefer viewing results on the probability scale, select type='logl' (i.e., curved lines). And consider using type= 'lnl', 'lnq', 'lnc' for log-normal and Poisson models with lines based on exponentiated values. In general, it is important to note that the observed data may not be on the same scale as the parameter estimates and may not be automatically visible in the graph (i.e., x and y-axis limits may help). When graphing target summary plots that use posterior predictive checks rather than the basic posterior summary graph (plots target values over the parameter estimate of the center such as the mean), enter y='target' and other arguments relevant to y='check'. However, type= 'bern', 'bin', 'sn' are not available but type='n', 'ln', 'w', 'g', or 't' are available. Default is NULL.

parameter

a character vector of length >= 1 or a 2 element list with the name(s) of parameter in MCMC chains to produce summary statistics. Use a 1 element vector to get posterior estimates of a single parameter. Use a 2 or more element vector to estimate the average joint effects of multiple parameters (e.g., average infection rate for interventions A and B when parameter= c('IntA', 'IntB')). Use a 2 element list to perform mathematical calculations of multiple parameters (see 'math' below). For example, use parameter=list('hospital_A', 'hospital_Z') if you want to estimate the difference between the hospital's outcomes. Use parameter= list(c('hospital_A','hospital_B'), ('hospital_Y','hospital_Z')) to estimate how different the combined hospitals A and B values are from the combined Hospital Y and Z values. When y='check', use either a multiple element character vector that represents center, spread, and additional distribution parameters in order of 1st, 2nd, and 3rd distribution parameters. For example, mean and sd for a normal distribution; mean log and sd log of a log-normal dist.; xi, omega, and alpha of a skew-normal distribution; shape, scale, and lambda of a Weibull distribution; shape and rate of a Gamma distribution; and mean, SD and nu (i.e., degrees of freedom) of a t-distribution. Or indicate regression parameters in order (e.g., intercept, Beta 1, Beta 2, etc.). When y='multi', use a multiple element character vector to list the parameter names of the hierarchy, in order of the nesting with the lowest level first (e.g., exams nested in patients nested in hospital). When y='multi', for parameters from multiple groups such as various hospitals, only enter the first unit's prefix of each parameter and the remaining groups will be set up for graphing. For example, parameter=c('theta', 'omega') will plot data for theta[1] to theta[8] and omega[1] to omega[8] for all 8 hospitals as well.

center

character vector that selects the type of central tendency to use when reporting parameter values. Choices include: 'mean', 'median', and 'mode'. Default is 'mode'.

mass

numeric vector that specifies the credible mass used in the Highest Density Interval (HDI). Default is 0.95.

compare

numeric vector with one comparison value to determine how much of the distribution is above or below the comparison value. Default is NULL.

rope

numeric vector with two values that define the Region of Practical Equivalence (ROPE). Test hypotheses by setting low and high values to determine if the Highest Density Interval (HDI) is within or outside of the ROPE. Parameter values are declared not credible if the entire ROPE lies outside the HDI of the parameter’s posterior (i.e., we reject the null hypothesis). For example, the ROPE of a coin is set to 0.45 to 0.55 but the posterior 95% HDI is 0.61 - 0.69 so we reject the null hypothesis value of the rate of a head is 0.50. We can accept the null hypothesis if the entire 95% HDI falls with the ROPE. Default is NULL.

data

object name for the observed data when y='check', y='multi' or y='target'.

dv

character vector of length == 1 for the dependent variable name in the observed data frame when y='check', y='multi' or y='target'. Default is NULL.

iv

character vector of length >= 1 for the independent variable name(s) in the observed data frame when y='check' or y='multi'. When y='multi', enter the lower to higher level clustering or group names (e.g, for health data, iv=c("patient", "hospital"). When type='taov', enter the name of the test group variable (e.g., 'intervention'). Default is NULL.

add.data

character vector of length == 1 to determine the type of observed data added to the plot (to show model fit to data) when y='check' and type= 'ol', 'oq','oc', 'lnl', 'lnq', 'lnc', 'logl', 'logq', or 'logc'. Select 'a', 'u', 'al', 'ul', 'n' for these observed data options: 'All', 'Unit', 'All: Lines', 'Unit: Lines' (unit specific lines are linked), 'none'. Default is 'n' for none or no observed data shown.

group

character list of length == 2 for 1) the grouping variable name and 2) specific group(s) in the observed data frame. This is primarily used for multilevel or hierarchical models when y='check' or y='multi' that the hierarchies are based on (e.g., hospitals nested within health systems).

main

the main title of the plot.

xlab

a character vector label for the x-axis.

ylab

a character vector label for the y-axis.

xlim

specify plot's x-axis limits with a 2 element numeric vector.

ylim

specify plot's y-axis limits with a 2 element numeric vector.

vlim

two element vector to specify limits for minimum and maximum values used to extrapolate posterior lines along the x-axis. For example, when drawing a log-normal distribution, we may want to have our posterior lines fit within a narrower range while having our graph's x-axis limits extend past those lines. If so, the value limits (vlim) help us keep our posterior predictive check lines within desired limits. Default is NULL.

curve

select a curve to display instead of a histogram when y='post'. Default is FALSE.

lwd

select the line width.

breaks

number of breaks in a histogram. Default is 15.

bcol

a single or multiple element character vector to specify the bar or band color(s). When Bayesian estimates and observed values are present, the first colors are for Bayesian estimates while the last colors are observed values. Defaults to, if nothing selected, 'gray', except when y = 'multi' and then no overall HDI is graphed until a color is selected.

lcol

a single or multiple element character vector to specify the line color(s). When Bayesian estimates and observed values are present, the first colors are Bayesian estimates while the last colors are observed values. When multiple lines are needed, single item lines precede multiple use lines. For example, a single comparison value line will be assigned the first lcol while both rope lines will be given the same color of the second lcol when y='post'. Defaults to 'gray' if nothing selected.

pcol

a single or multiple element character vector to specify the point color(s). When Bayesian estimates and observed values are present, the first colors are Bayesian estimates while the last colors are observed values. Defaults to, if nothing selected, 'gray'.

xpt

a numeric vector of single or multiple values that indicate placement of points (+) on the x-axis when y='check'. This is intended for the graphs with predictive checks on Bayesian estimation (i.e., not trend lines). Default is NULL.

tgt

specify 1 or more values on the x- or y-axis of where to add one or more target lines when applicable. Default is NULL.

tgtcol

select one or multiple colors for one or multiple target lines. Default is 'gray'.

tpline

add one or more time point vertical lines using x-axis values. Default is NULL (i.e., no lines).

tpcol

specify a color for the time point line, tpline. Default is NULL.

pline

a numeric vector of length == 1 for the number of random posterior predictive check lines when y='check'. Default is 20.

pct

a numeric integer vector of length == 1 for the percentage of the posterior predictive check heavy tail lines to be drawn when type= 'taov' or 'taov1'. Valid values are 0 < pct < 100. Default is 95 (e.g., 95%).

add.legend

add a legend by selecting the location as "bottomright", "bottom", "bottomleft", "left", "topleft", "top", "topright", "right", "center". Default is no legend produced if nothing is selected.

legend

a character vector of length >= 1 to appear when y='check', y='multi', and sometimes y='target'. Legends to represent hierarchical estimates and observed values.

cex

A numerical value giving the amount by which plotting text and symbols should be magnified relative to the default of 1.

cex.lab

The magnification to be used for x and y labels relative to the current setting of cex.

cex.axis

The magnification to be used for axis annotation relative to the current setting of cex.

cex.main

The magnification to be used for main titles relative to the current setting of cex.

cex.text

The magnification to be used for the text added into the plot relative to the current setting of 1.

cex.legend

The magnification to be used for the legend added into the plot relative to the current setting of 1.

HDItext

numeric vector of length == 1 that can be a negative or positive value. Identifies placement of HDI text near credible interval when y='post'. Values are relative to the x-axis values. Default is 0.7.

math

mathematics function performed between multiple parameters when y='post'. Available functions are: 'add', 'subtract', 'multiply', 'divide', or 'n' for none (i,e., no functions). Indicate parameters with parameter argument. For example, when math='subtract', use parameter=list('hospital_A', 'hospital_Z') if you want to estimate the difference between the hospital's outcomes. Use parameter=list(c('hospital_A','hospital_B'), ('hospital_Y','hospital_Z')) to estimate how different the combined hospitals A and B values are from the combined hospitals Y and Z. Additionally, compute statistics like the coefficient of variation when math='divide' and parameter= list('Standard_Deviation', 'Mean'). Default is 'n' for no math function.

es

one element vector that indicates which type of likelihood distribution is relevant in calculating Jacob Cohen's effect sizes between 2 parameters when y='post'. Options are 'bern', 'bin', and 'n' for the Bernoulli or binomial distributions for binary outcomes and none (i.e., no distribution, hence no effect size calculated). For example, to get the posterior distribution summary for the difference between the intervention and control groups on 30-day readmissions or not, use es='bern' or 'bin' when y='post', math='subtract', and parameter=list('intMean', 'ctlMean'). Default is 'n' which indicates not to calculate the effect size.

subset

a single or multiple element character or numeric vector of group names that are a subset of the observations to use in the grouping when y='multi'. The default is NULL, thereby using all observations. Specify, for example, enter c('NY', 'Toronto', 'LA', 'Vancouver') to view a graph with only these cities. Default is NULL.

level

a numeric integer of length == 1, either 1, 2, or 3 that indicates the level of the hierarchical/multilevel model when y='multi' and the type of graph to plot. For example, a multilevel model that estimates the proportion of successful exams by patients is considered level=2. And the successful exam rates by patients from various hospitals is level=3. Graphs can be created separately for both level=2 and level=3 when there is a three-level model. The graph when y='multi' can be produced when level=1 for non-hierarchical models if there are estimates for groups. For example, estimating the patient infection rate of hospitals without a hierarchical structure in the model. Default is NULL.

aorder

a logical indicator on whether the ordering of the group levels are in alphabetical order or not when y='multi'. If aorder=TRUE, results are displayed in an increasing alphabetical order based on level name (e.g., 'LA' before 'NY'). If aorder=FALSE, an increasing numeric order based on group parameter values is performed (e.g., 0.65 before 0.70). Default is TRUE.

round.c

an integer indicating the number of decimal places when rounding numbers y='multi' and y='target'. Default is 2.

...

additional arguments.

Value

plots assessing model quality, posterior distributions, posterior predictive checks, hierarchical or multilevel model summaries, and target summaries.

References

Cohen, J. (1988). Statistical Power Analysis for the Behavioral Sciences, Second Edition. Hillsdale, NJ: Lawrence Erlbaum Associates, Publishers. ISBN 0-8058-0283-5 Gelman, A., Goodrich, B., Gabry, J., & Vehtari, A. (2019). R-squared for Bayesian Regression Models. The American Statistician, 73, 3, 307–309. https://doi.org/10.1080/00031305.2018.1549100

Kruschke, J. (2014). Doing Bayesian Data Analysis: A Tutorial with R, JAGS, and Stan, Second Edition. New York: Academic Press. ISBN: 9780124058880

Examples

# Posterior estimates of length of stay #
# Set up the Bayes object first, convert the chains
blos1 <- Bayes(losmcmc, newdata=TRUE)

#' # Examine Bayesian model diagnostics for estimated length of stay (LOS) #
# Review traceplots, density plots, autocorrelation factor, and Gelman-Rubin statistic
plot(blos1, y="dxt", parameter="muOfY")
plot(blos1, y="dxd", parameter="muOfY")
plot(blos1, y="dxa", parameter="muOfY")
plot(blos1, y="dxg", parameter="muOfY")

# And we can calculate statistics that combine multiple parameters (e.g., calculate
# the mean difference between intervention and control groups). Here we get the
# coefficient of variation by dividing the standard deviation by the mean. We use
# the 'math' argument and arrange our parameters in the proper order.
# We only need the first 4 arguments but we'll add a little more.
plot(x=blos1, y="post", parameter=list("sigmaOfY", "muOfY" ),math="divide",
bcol="cyan", HDItext=.3, main= "Coefficient of Variation")

# Posterior Predictive Checks on how well our model fits the data #
# Estimating center and spread for hospital length of stay
# Generally, we only need the first 6 arguments but we'll modify the plot.
# A model with a gamma likelihood would fit better but this is ok for now.
plot(x=blos1, y="check", type="n", data=hosprog, dv="los",
parameter=c("muOfY", "sigmaOfY"), breaks=30, cex.axis=1.3, lwd=3, xlab=NULL,
pline=20, vlim=c(-2, 20), xlim=c(-2, 20), add.legend="topright",
main="Length of Stay", cex.main=1.5, xpt=5, pcol="red", lcol="orange",
cex.legend=1, bcol="cyan")

# Estimating the regression trend line
# Now lets look at the trend of conc on CO2 uptake from the CO2 data.
# Using a quadratic model with conc^2 would help and an option in ham.
# First, create the Bayes object
bco2 <- Bayes(x=co2mcmc, newdata=TRUE )
# We generally only need the first 7 arguments below.
plot(x=bco2, y="check", type="ol", data=CO2, dv="uptake", iv="conc",
parameter=c("b0","b1"), add.data="al", cex.axis=1.3, lwd=1.5, pline=50,
vlim=c(50, 1100), xlim=c(0, 1100), ylim=c(0, 50), cex=2, cex.lab=2,
pcol="magenta", cex.main=2,cex.legend=1.2,  add.legend="topleft",
lcol="steelblue")              #vlim lets me extrapolate a little

# Hierarchical or Multilevel Model Summary #
# We generally only need the first 3 arguments below. But we'll subset
# on 8 of 12 plants in the level 2 model (observations nested in plants)
# and modify other settings.
plot(x=co2multi, y="multi", level=2, aorder=FALSE,
subset= c("Qn2","Qn3","Qc3","Qc2","Mn3","Mn2","Mc2","Mc3"),
lcol="blue", pcol= c("red", "skyblue"), round.c=1, bcol="yellow",
xlim=c(-.1, 1), legend=NULL, add.legend="topright", lwd=3, cex.lab=1.2,
cex= 2, cex.main=1.5, cex.axis=.75, cex.legend=1.5, X.Lab=NULL)
# And now the level 3 plot (observation in plants in Treatment by type groups)
plot(x=co2multi, y="multi", level=3, aorder=FALSE, lcol="blue", pcol= c("green", "pink"),
round.c=1, bcol="lavender", xlim=c(-.1, 1), legend=NULL, add.legend="right", lwd=3,
cex.lab =1.2, cex= 2, cex.main=1.5, cex.axis=.75, cex.legend=1.5, X.Lab=NULL)

# Targets for length of stay (LOS) #
# Our administrators ask how far are we from our goals, they ask about targets
# in increments of 5 points of probability or specific days. We answer both.
btarget1 <- Bayes(x=losmcmc, y="target", type="n", parameter=c("muOfY","sigmaOfY"),
newdata=TRUE, target=list(p=c(.35,.4,.45, .5, .55),  y=c(3,4))) # 'newdata' for plots
# Target graph using the standard option, overlaying estimate of mode parameter
plot(x=btarget1, y="target", type="n", lcol="purple", tgtcol="blue", xlim=c(3.5, 5))
# Target graph using a posterior predictive check, more intuitive
plot(x=btarget1, y="target", type="n", data=hosprog, dv="los", breaks=30,
cex.axis=1.3, lwd=3, pline=20, vlim=c(-1, 12), xlim=c(-1, 10),
parameter=c("muOfY","sigmaOfY"), add.legend="topright", main="Length of Stay",
cex.main=1.5, xpt=5, pcol="black", lcol="cyan", tgtcol="blue", bcol="orange",
cex.legend=1.5, cex.text = 2)
# Posterior estimates of length of stay #
# Set up the Bayes object first, convert the chains
blos1 <- Bayes(losmcmc, newdata=TRUE)

#' # Examine Bayesian model diagnostics for estimated length of stay (LOS) #
# Review traceplots, density plots, autocorrelation factor, and Gelman-Rubin statistic
plot(blos1, y="dxt", parameter="muOfY")
plot(blos1, y="dxd", parameter="muOfY")
plot(blos1, y="dxa", parameter="muOfY")
plot(blos1, y="dxg", parameter="muOfY")

# And we can calculate statistics that combine multiple parameters (e.g., calculate
# the mean difference between intervention and control groups). Here we get the
# coefficient of variation by dividing the standard deviation by the mean. We use
# the 'math' argument and arrange our parameters in the proper order.
# We only need the first 4 arguments but we'll add a little more.
plot(x=blos1, y="post", parameter=list("sigmaOfY", "muOfY" ),math="divide",
bcol="cyan", HDItext=.3, main= "Coefficient of Variation")

# Posterior Predictive Checks on how well our model fits the data #
# Estimating center and spread for hospital length of stay
# Generally, we only need the first 6 arguments but we'll modify the plot.
# A model with a gamma likelihood would fit better but this is ok for now.
plot(x=blos1, y="check", type="n", data=hosprog, dv="los",
parameter=c("muOfY", "sigmaOfY"), breaks=30, cex.axis=1.3, lwd=3, xlab=NULL,
pline=20, vlim=c(-2, 20), xlim=c(-2, 20), add.legend="topright",
main="Length of Stay", cex.main=1.5, xpt=5, pcol="red", lcol="orange",
cex.legend=1, bcol="cyan")

# Estimating the regression trend line
# Now lets look at the trend of conc on CO2 uptake from the CO2 data.
# Using a quadratic model with conc^2 would help and an option in ham.
# First, create the Bayes object
bco2 <- Bayes(x=co2mcmc, newdata=TRUE )
# We generally only need the first 7 arguments below.
plot(x=bco2, y="check", type="ol", data=CO2, dv="uptake", iv="conc",
parameter=c("b0","b1"), add.data="al", cex.axis=1.3, lwd=1.5, pline=50,
vlim=c(50, 1100), xlim=c(0, 1100), ylim=c(0, 50), cex=2, cex.lab=2,
pcol="magenta", cex.main=2,cex.legend=1.2,  add.legend="topleft",
lcol="steelblue")              #vlim lets me extrapolate a little

# Hierarchical or Multilevel Model Summary #
# We generally only need the first 3 arguments below. But we'll subset
# on 8 of 12 plants in the level 2 model (observations nested in plants)
# and modify other settings.
plot(x=co2multi, y="multi", level=2, aorder=FALSE,
subset= c("Qn2","Qn3","Qc3","Qc2","Mn3","Mn2","Mc2","Mc3"),
lcol="blue", pcol= c("red", "skyblue"), round.c=1, bcol="yellow",
xlim=c(-.1, 1), legend=NULL, add.legend="topright", lwd=3, cex.lab=1.2,
cex= 2, cex.main=1.5, cex.axis=.75, cex.legend=1.5, X.Lab=NULL)
# And now the level 3 plot (observation in plants in Treatment by type groups)
plot(x=co2multi, y="multi", level=3, aorder=FALSE, lcol="blue", pcol= c("green", "pink"),
round.c=1, bcol="lavender", xlim=c(-.1, 1), legend=NULL, add.legend="right", lwd=3,
cex.lab =1.2, cex= 2, cex.main=1.5, cex.axis=.75, cex.legend=1.5, X.Lab=NULL)

# Targets for length of stay (LOS) #
# Our administrators ask how far are we from our goals, they ask about targets
# in increments of 5 points of probability or specific days. We answer both.
btarget1 <- Bayes(x=losmcmc, y="target", type="n", parameter=c("muOfY","sigmaOfY"),
newdata=TRUE, target=list(p=c(.35,.4,.45, .5, .55),  y=c(3,4))) # 'newdata' for plots
# Target graph using the standard option, overlaying estimate of mode parameter
plot(x=btarget1, y="target", type="n", lcol="purple", tgtcol="blue", xlim=c(3.5, 5))
# Target graph using a posterior predictive check, more intuitive
plot(x=btarget1, y="target", type="n", data=hosprog, dv="los", breaks=30,
cex.axis=1.3, lwd=3, pline=20, vlim=c(-1, 12), xlim=c(-1, 10),
parameter=c("muOfY","sigmaOfY"), add.legend="topright", main="Length of Stay",
cex.main=1.5, xpt=5, pcol="black", lcol="cyan", tgtcol="blue", bcol="orange",
cex.legend=1.5, cex.text = 2)

Shewhart control charts

Description

Graph X-bar charts, p-charts, and u-charts. This includes producing means center lines, 3-sigma upper and lower control limits. Users can also calculate values before and after an intervention to see if a change in the control process happened. Values are returned in a data frame.

Usage

## S3 method for class 'control'
plot(
  x,
  y = NULL,
  xlim = NULL,
  ylim = NULL,
  xlab = NULL,
  ylab = NULL,
  main = NULL,
  lwd = NULL,
  col = NULL,
  iname = NULL,
  icol = "black",
  trend = FALSE,
  trcol = "gray",
  tgt = NULL,
  tgtcol = "gray",
  tpline = NULL,
  tpcol = NULL,
  cex = 1,
  cex.lab = NULL,
  cex.axis = NULL,
  cex.main = NULL,
  cex.text = NULL,
  x.axis = NULL,
  y.axis = NULL,
  round.c = NULL,
  ...
)
## S3 method for class 'control'
plot(
  x,
  y = NULL,
  xlim = NULL,
  ylim = NULL,
  xlab = NULL,
  ylab = NULL,
  main = NULL,
  lwd = NULL,
  col = NULL,
  iname = NULL,
  icol = "black",
  trend = FALSE,
  trcol = "gray",
  tgt = NULL,
  tgtcol = "gray",
  tpline = NULL,
  tpcol = NULL,
  cex = 1,
  cex.lab = NULL,
  cex.axis = NULL,
  cex.main = NULL,
  cex.text = NULL,
  x.axis = NULL,
  y.axis = NULL,
  round.c = NULL,
  ...
)

Arguments

x

control object.

y

not currently used, default is NULL.

xlim

specify plot's x-axis limits with a 2 element numeric vector.

ylim

specify plot's y-axis limits with a 2 element numeric vector.

xlab

a character vector label for the x-axis.

ylab

a character vector label for the y-axis.

main

the main title of the plot.

lwd

select the line width.

col

a 2 element character vector to specify the 1) center line and 2) both control limit colors. Defaults to, if nothing selected, c("blue", "red").

iname

intervention name text as a single character vector. Defaults to "Intervention" if nothing is selected.

icol

specify intervention line color as a single character vector. Defaults to "black" if nothing is selected.

trend

a logical vector that adds an ordinary least squares trend line (i.e., simple linear regression line) when trend=TRUE. Default is FALSE.

trcol

select a color for the trend line. Default is 'gray'.

tgt

specify 1 or more values on the y-axis of where to add one or more horizontal target lines. Default is NULL.

tgtcol

select one or multiple colors for one or multiple target lines. Default is 'gray'.

tpline

add one or more time point vertical lines using x-axis values. Default is NULL (i.e., no lines).

tpcol

specify a color for the time point line, tpline. Default is NULL.

cex

A numerical value giving the amount by which plotting text and symbols should be magnified relative to the default of 1.

cex.lab

The magnification to be used for x and y labels relative to the current setting of cex.

cex.axis

The magnification to be used for axis annotation relative to the current setting of cex.

cex.main

The magnification to be used for main titles relative to the current setting of cex.

cex.text

The magnification to be used for the iname text added into the plot relative to the current setting of 1.

x.axis

a vector of unique character or numeric values that makes up x-axis values to replace the time variable values. This will be most helpful if you prefer current calendar months/years instead of values starting at 1 (e.g., x.axis= sort(unique(data$Year)) for 1900-1999, not 1-100). Must have equal lengths for unique x.axis values and replaced values (i.e., nrow(x)). Default is NULL.

y.axis

a vector of unique character or numeric values that makes up y-axis values to replace the outcome variable values. This will be most helpful if your outcome needs to be converted such as rate per 1,000 patient days (e.g., y.axis= seq(min(x$HAI)*1000, max(x$HAI)*1000, length.out=nrow(x))). Must have equal lengths for unique y.axis values and replaced values (i.e., nrow(x)). Default is NULL.

round.c

an integer indicating the number of decimal places when rounding numbers such as for y.axis. Default is 2.

...

additional arguments.

Value

plot of Shewhart control charts: X-bar charts, p-charts, and u-charts with 3-sigma control limits.

References

Ryan, T. (2011). Statistical Methods for Quality Improvement, Third Edition. John Wiley & Sons, Inc. ISBN: 978-0-470-59074-4.

Examples

## Hospital LOS and readmissions ##
# X-bar chart
spc_x <- control(x="los", time="month", data=hosprog, type="x", n.equal=TRUE)
# Basic X-bar chart
plot(spc_x)

# p-chart, using only the numerator (i.e., y=NULL). Specify unequal sample sizes
spc_p <- control(x="rdm30", time="month", data=hosprog, type="p", n.equal=FALSE)
# p-chart, adding target and time point lines
plot(spc_p, tgt=c(0,.25), tgtcol="green", ylim=c(0,0.4), tpline=c(4,8),
tpcol= c("yellow","black"))

# u-chart for infection rates with an intervention
spc_u <- control(x="HAI", y="PatientDays", time="Month", data=infections,
type="u", n.equal=FALSE, intervention=22)
# u-chart with trend lines, various graphing options, x.axis start at 2nd year
# and y.axis changed to show HAIs per 1,000 patient days
plot(spc_u, main="u-Chart: HAI per 1,000 Patient Days Pre/Post Intervention",
col=c("green","dodgerblue"), trend=TRUE, trcol="red", x.axis=c((1:41+12)), round.c=1,
y.axis=seq(min(spc_u$HAI)*1000, max(spc_u$HAI)*1000, length.out=nrow(spc_u)),
xlab="Months (starting at year 2)", icol="gray", lwd=2, cex=2,
cex.axis=1.1, cex.main=1.25, cex.text=1.25)
## Hospital LOS and readmissions ##
# X-bar chart
spc_x <- control(x="los", time="month", data=hosprog, type="x", n.equal=TRUE)
# Basic X-bar chart
plot(spc_x)

# p-chart, using only the numerator (i.e., y=NULL). Specify unequal sample sizes
spc_p <- control(x="rdm30", time="month", data=hosprog, type="p", n.equal=FALSE)
# p-chart, adding target and time point lines
plot(spc_p, tgt=c(0,.25), tgtcol="green", ylim=c(0,0.4), tpline=c(4,8),
tpcol= c("yellow","black"))

# u-chart for infection rates with an intervention
spc_u <- control(x="HAI", y="PatientDays", time="Month", data=infections,
type="u", n.equal=FALSE, intervention=22)
# u-chart with trend lines, various graphing options, x.axis start at 2nd year
# and y.axis changed to show HAIs per 1,000 patient days
plot(spc_u, main="u-Chart: HAI per 1,000 Patient Days Pre/Post Intervention",
col=c("green","dodgerblue"), trend=TRUE, trcol="red", x.axis=c((1:41+12)), round.c=1,
y.axis=seq(min(spc_u$HAI)*1000, max(spc_u$HAI)*1000, length.out=nrow(spc_u)),
xlab="Months (starting at year 2)", icol="gray", lwd=2, cex=2,
cex.axis=1.1, cex.main=1.25, cex.text=1.25)

Decision Curve Analysis plots and regression model classification graphs on sensitivity and specificity

Description

Graph decide class object's model classification results as well as Decision Curve analysis output for net benefit and interventions saved.

Usage

## S3 method for class 'decide'
plot(
  x,
  y = NULL,
  main = NULL,
  xlab = NULL,
  ylab = NULL,
  xlim = NULL,
  ylim = NULL,
  lwd = NULL,
  bcol = NULL,
  lcol = NULL,
  add.legend = NULL,
  legend = NULL,
  cex = 1,
  cex.lab = NULL,
  cex.axis = NULL,
  cex.main = NULL,
  cex.legend = NULL,
  round.c = 2,
  ...
)
## S3 method for class 'decide'
plot(
  x,
  y = NULL,
  main = NULL,
  xlab = NULL,
  ylab = NULL,
  xlim = NULL,
  ylim = NULL,
  lwd = NULL,
  bcol = NULL,
  lcol = NULL,
  add.legend = NULL,
  legend = NULL,
  cex = 1,
  cex.lab = NULL,
  cex.axis = NULL,
  cex.main = NULL,
  cex.legend = NULL,
  round.c = 2,
  ...
)

Arguments

x

decide object.

y

type of plot to display. Select either 'nb', 'is', or 'cl' for a decision curve analysis 'net benefit' and 'interventions saved', or a model classification (e.g., sensitivity and specificity) according to a selected threshold. Net benefit and interventions saved display results for specific percentiles found between the 1st and 99th percentiles. Default is 'nb'.

main

the main title of the plot.

xlab

a character vector label for the x-axis.

ylab

a character vector label for the y-axis.

xlim

specify plot's x-axis limits with a 2 element numeric vector.

ylim

specify plot's y-axis limits with a 2 element numeric vector. When y='is', the y-axis labels are in integers (e.g., 1 to 100) but the actual scale is in decimals (0.0 to 1.0), please note when making y-axis limits.

lwd

select the line width.

bcol

a multiple element character vector of length == 2 to specify the bar, band, or block colors that are the shading of the true and false classification regions of the plot (e.g., true-positive and false-negative). When y='cl', the first color represents 'true' and the second color is for 'negative'. Default is null, if none selected, the colors are c('green', 'red').

lcol

a single or multiple element character vector to specify the line color(s). When y='nb', select up to 3 colors in this order for model, 'net benefit', 'all treated', and 'none treated' line colors. When y='is', select 1 color for the interventions saved line. And when y='cl', select 1 color to represent the selected threshold.

add.legend

add a legend by selecting the location as "bottomright", "bottom", "bottomleft", "left", "topleft", "top", "topright", "right", "center". Default is no legend produced if nothing is selected.

legend

a character vector of length >= 2 to appear when y='nb' and y='cl' with legend description.

cex

A numerical value giving the amount by which plotting text and symbols should be magnified relative to the default of 1.

cex.lab

The magnification to be used for x and y labels relative to the current setting of cex.

cex.axis

The magnification to be used for axis annotation relative to the current setting of cex.

cex.main

The magnification to be used for main titles relative to the current setting of cex.

cex.legend

The magnification to be used for the legend added into the plot relative to the current setting of 1.

round.c

an integer indicating the number of decimal places when rounding numbers y='multi' and y='target'. Default is 2.

...

additional arguments.

Value

plots of model classification, net benefit, and interventions saved.

References

Vickers, A. & Elkin, E. (2006). Decision Curve Analysis: A Novel Method for Evaluating Prediction Models. Society for Medical Decision Making, 26, 6, 565-574. https://doi.org/10.1177/0272989X06295361

Examples

## Predicting car engine shape type, v or straight  ##
# run the model
car_m1 <- assess(formula=vs ~ hp + am, data=mtcars, regression="logistic")
# create a decide object, enter the model name and a threshold on the logit scale
d1 <- decide(x=car_m1, threshold= -0.767)
#Plot the classification results
plot(x=d1, y= "cl")

#Plot the net benefit
plot(x=d1, y= "nb")

#Plot the interventions saved
plot(x=d1, y= "is")
## Predicting car engine shape type, v or straight  ##
# run the model
car_m1 <- assess(formula=vs ~ hp + am, data=mtcars, regression="logistic")
# create a decide object, enter the model name and a threshold on the logit scale
d1 <- decide(x=car_m1, threshold= -0.767)
#Plot the classification results
plot(x=d1, y= "cl")

#Plot the net benefit
plot(x=d1, y= "nb")

#Plot the interventions saved
plot(x=d1, y= "is")

Confidence interval graphs for group class objects

Description

Confidence interval graphs for group class objects

Usage

## S3 method for class 'group'
plot(
  x,
  y = "group",
  order = "alpha",
  gcol = "blue",
  gband = FALSE,
  pcol = "red",
  overall = FALSE,
  ocol = "gray",
  oband = FALSE,
  tgt = NULL,
  tcol = "gray",
  tpline = NULL,
  tpcol = "gray",
  xlim = NULL,
  ylim = NULL,
  main = NULL,
  xlab = NULL,
  ylab = NULL,
  lwd = 1,
  adj.alpha = 0.4,
  cex = 1,
  cex.axis = 1,
  cex.lab = 1,
  cex.main = 1,
  cex.text = 1,
  round.c = 2,
  name = FALSE,
  abbrv = 5,
  roll.x = "Stop",
  ...
)
## S3 method for class 'group'
plot(
  x,
  y = "group",
  order = "alpha",
  gcol = "blue",
  gband = FALSE,
  pcol = "red",
  overall = FALSE,
  ocol = "gray",
  oband = FALSE,
  tgt = NULL,
  tcol = "gray",
  tpline = NULL,
  tpcol = "gray",
  xlim = NULL,
  ylim = NULL,
  main = NULL,
  xlab = NULL,
  ylab = NULL,
  lwd = 1,
  adj.alpha = 0.4,
  cex = 1,
  cex.axis = 1,
  cex.lab = 1,
  cex.main = 1,
  cex.text = 1,
  round.c = 2,
  name = FALSE,
  abbrv = 5,
  roll.x = "Stop",
  ...
)

Arguments

x

group object.

y

type of confidence interval object, specify either 'group', 'time', or 'roll'.

order

specify confidence interval object order as 'alpha' or 'numeric' for alphabetical or numerical ordering in the 'group' graph.

gcol

pick confidence interval line colors for groups in the 'group' graph. Default is 'blue'.

gband

logical TRUE or FALSE that indicates whether group lines have confidence bands for trend over time results. Default is FALSE.

pcol

select point color for 'group' only confidence intervals. Default is 'red'.

overall

logical TRUE or FALSE that indicates whether to include the overall sample confidence intervals (i.e., not each group). Default is FALSE.

ocol

indicate the optional overall line color. Default is 'gray' when overall=TRUE.

oband

logical TRUE or FALSE that indicates whether to add an overall confidence band. Default is FALSE.

tgt

specify 1 or more values on the x-axis of where to add a target line when y='group'. Or 1 or more values on the y-axis of where to add a target line when y='time' or 'roll'. Default is NULL.

tcol

select a color for the target line. Default is 'gray'.

tpline

add one or more time point vertical line(s) using x-axis values when y='time' or y='roll'. Default is NULL.

tpcol

specify a color for the time point line, tpline. Default is NULL.

xlim

specify plot's x-axis limits with a 2 value vector.

ylim

specify plot's y-axis limits with a 2 value vector.

main

the main title of the plot.

xlab

a vector label for the x-axis.

ylab

a vector label for the y-axis.

lwd

select the line width. Default is 1.

adj.alpha

factor modifying the opacity alpha of the confidence interval bands, in the range of 0 to 1. Default is 0.4.

cex

A numerical value giving the amount by which plotting text and symbols should be magnified relative to the default of 1.

cex.axis

The magnification to be used for axis annotation relative to the current setting of cex. Default is 1.

cex.lab

The magnification to be used for x and y labels relative to the current setting of cex. Default is 1.

cex.main

The magnification to be used for main titles relative to the current setting of cex. Default is 1.

cex.text

The magnification to be used for the text added into the plot relative to the current setting of 1.

round.c

an integer indicating the number of decimal places to be used for rounding values. Default is 2.

name

logical TRUE or FALSE that indicates whether group names should be added to the 'time' or 'roll' plots. Default is FALSE.

abbrv

the minimum length of the abbreviations. Default is 5.

roll.x

one of 2 character options to graph rolling average x-axis values as either the 'Start' or 'Stop' of the time period. Selecting 'Start' indicates the time point on the x-axis as the first month (e.g., roll=12 will have a graph that starts at 1 on the x-axis). The default is 'Stop' (e.g., roll=12 will have a graph that starts at 12 on the x-axis).

...

additional arguments.

Value

plot of group level confidence intervals, including estimates over time periods.

Examples

#Simple graph for confidence intervals using the t-distribution
gr1 <- group(x="program", y="los", z="month", data=hosprog, dist="t",
increment=3, rolling=6)
# Group level confidence intervals
plot(x=gr1, y="group", order="numeric", lwd=4, gcol= "blue", pcol="red",
overall=TRUE, oband=TRUE, ocol="gray", tcol="green", tgt=4.5,
cex=1, cex.axis=1, cex.lab=1, cex.text=2,
cex.main=1.25, name=TRUE, adj.alpha=.2)
#Trend plots over time in the 3 month increments (i.e., quarters)
plot(x=gr1, y="time", lwd=4, gcol=c("red", "blue"), gband=TRUE, overall=TRUE,
  oband=TRUE, ocol="gray", tcol="green", tgt=4, tpline=3,
  tpcol="yellow", name=TRUE, cex.axis=1, cex.lab=1, cex.text=2,
  cex.main=1.25, adj.alpha=.3)
#Plot for rolling 6-month averages
plot(x=gr1, y="roll", lwd=4, gcol=c("red", "blue"), gband=TRUE, overall=TRUE,
  oband=TRUE, ocol="gray", tcol="green", tgt=4, tpline=c(4,6),
  tpcol="yellow", name=TRUE, cex.axis=1, cex.lab=1, cex.text=2,
  cex.main=1.25, adj.alpha=.3)

  # Helpful note: You can plot aggregated data if you used the argument asis=TRUE in group(),
#Simple graph for confidence intervals using the t-distribution
gr1 <- group(x="program", y="los", z="month", data=hosprog, dist="t",
increment=3, rolling=6)
# Group level confidence intervals
plot(x=gr1, y="group", order="numeric", lwd=4, gcol= "blue", pcol="red",
overall=TRUE, oband=TRUE, ocol="gray", tcol="green", tgt=4.5,
cex=1, cex.axis=1, cex.lab=1, cex.text=2,
cex.main=1.25, name=TRUE, adj.alpha=.2)
#Trend plots over time in the 3 month increments (i.e., quarters)
plot(x=gr1, y="time", lwd=4, gcol=c("red", "blue"), gband=TRUE, overall=TRUE,
  oband=TRUE, ocol="gray", tcol="green", tgt=4, tpline=3,
  tpcol="yellow", name=TRUE, cex.axis=1, cex.lab=1, cex.text=2,
  cex.main=1.25, adj.alpha=.3)
#Plot for rolling 6-month averages
plot(x=gr1, y="roll", lwd=4, gcol=c("red", "blue"), gband=TRUE, overall=TRUE,
  oband=TRUE, ocol="gray", tcol="green", tgt=4, tpline=c(4,6),
  tpcol="yellow", name=TRUE, cex.axis=1, cex.lab=1, cex.text=2,
  cex.main=1.25, adj.alpha=.3)

  # Helpful note: You can plot aggregated data if you used the argument asis=TRUE in group(),

Plot of variable importance ranked by partial chi-square statistic

Description

Plots an importance class object. Produces a dot chart that places the predictor variable with the highest partial chi-square (Wald chi-square for individual coefficients) at the bottom. It is a metric of the partial chi-square minus its degrees of freedom (Harrell, 2015). Predictor variables with significant p-values at the 0.05 alpha are highlighted red. Consider graphical parameters of mar=c(4.2, 2, 3.5, 3) and oma = c(0, 0, 0, 3).

Usage

## S3 method for class 'importance'
plot(
  x,
  y,
  main = NULL,
  cex = NULL,
  pt.cex = NULL,
  pch = NULL,
  color = NULL,
  lcolor = NULL,
  ...
)
## S3 method for class 'importance'
plot(
  x,
  y,
  main = NULL,
  cex = NULL,
  pt.cex = NULL,
  pch = NULL,
  color = NULL,
  lcolor = NULL,
  ...
)

Arguments

x

importance object.

y

not currently used.

main

overall title for the plot, default is 'Variable Importance'.

cex

the character size to be used. Setting cex to a value smaller than 1 can be a useful way of avoiding label overlap. This sets the actual size, not a multiple of par('cex').

pt.cex

the cex to be applied to plotting symbols, default is 2.

pch

the plotting character or symbol to be used, default is 19.

color

the color to be used for points and labels when there are significant results. Default is 'red'.

lcolor

the color(s) to be used for the horizontal lines. Default is 'gray'.

...

additional arguments.

Value

plot of variable importance, significant variables highlighted in red.

References

Harrell, F. E., Jr. (2016). Regression Modeling Strategies. Springer International Publishing. ISBN: 978-3-319-19424-0.

Examples

# OLS regression
plot(importance(assess(mpg ~ hp + wt + cyl, data=mtcars, regression= "ols")$model))

# logistic regression
plot(importance(assess(vs~mpg+wt+hp, data=mtcars, regression= "logistic")$model))
# OLS regression
plot(importance(assess(mpg ~ hp + wt + cyl, data=mtcars, regression= "ols")$model))

# logistic regression
plot(importance(assess(vs~mpg+wt+hp, data=mtcars, regression= "logistic")$model))

Print alpha results

Description

Formats alpha class results to display summary statistics of scale information. These include the overall alpha, scale mean and standard deviation, item statistics, scale statistics if an item is removed from the scale, and the total sample size.

Usage

## S3 method for class 'alpha'
print(x, ...)
## S3 method for class 'alpha'
print(x, ...)

Arguments

x

alpha object from Cronbach's alpha calculation.

...

Additional arguments.

Value

formatted alpha results.

Examples

print(alpha(items=c("i1","i2","i3","i4","i5"), data=cas))

print(alpha(items=c("i1","i2","i3","i4","i5"), data=cas))

Print interpret object

Description

Formats interpretations from interpret class objects. Provides simple interpretations of regression coefficients and Cronbach's alpha. Print specific model interpretations (or run all), returned in sentence and paragraph formats.

Usage

## S3 method for class 'interpret'
print(x, ...)
## S3 method for class 'interpret'
print(x, ...)

Arguments

x

interpret object.

...

Additional arguments.

Value

formatted interpret object results.

Examples


#Cronbach's alpha
print(interpret(alpha(items=c("i1","i2","i3","i4","i5"), data=cas)))

#' # interpret a standard linear (OLS) regression
hos1 <- assess(formula=survey ~ program + month, data=hosprog, regression= "ols")
print(interpret(hos1)$model)

# interpret a differences-in-differences model
hos2 <- assess(formula=survey ~ ., data=hosprog, intervention = "program",
int.time="month", treatment = 5, did="two", newdata=TRUE)
interpret(hos2)$did

# interpret an interrupted time series model
hos3 <- assess(formula=survey ~ ., data=hosprog, intervention = "program",
int.time="month", its="two", interrupt = 5)
interpret(hos3)$its

#Cronbach's alpha
print(interpret(alpha(items=c("i1","i2","i3","i4","i5"), data=cas)))

#' # interpret a standard linear (OLS) regression
hos1 <- assess(formula=survey ~ program + month, data=hosprog, regression= "ols")
print(interpret(hos1)$model)

# interpret a differences-in-differences model
hos2 <- assess(formula=survey ~ ., data=hosprog, intervention = "program",
int.time="month", treatment = 5, did="two", newdata=TRUE)
interpret(hos2)$did

# interpret an interrupted time series model
hos3 <- assess(formula=survey ~ ., data=hosprog, intervention = "program",
int.time="month", its="two", interrupt = 5)
interpret(hos3)$its

USA's unemployment rate between 1929 and 2024

Description

USA's unemployment rate between 1929 and 2024

Usage

unemployment
unemployment

Format

`unemployment`

An artificial data frame with 96 rows and 5 columns:

Year: Calendar year as an integer from 1929 to 2024.
rate: Unemployment rate defined as the proportion of the US Labor Market that were unemployed.
event: Notable events in the United States that may have impacted the unemployment rate. Other than notable events (or unspecified) listed as 'other'.
year: Integer for the number of years for 1929 to 2024, ranging from 1 to 96.
usa: Value of 1 to indicate data for the USA.

...

Source

unemployment is an artificial data frame of reasonable estimates made entirely for the purpose of demonstrating interrupted time series with many interruptions. For precise rates, please see a reliable source.