Compares two data frames/tibbles (or two objects coercible to tibbles, like matrices), optionally ignoring row and column ordering, and returns TRUE if both are equal, or FALSE otherwise. If the latter is the case and quiet = FALSE, information about detected differences is printed to the console.
Usage
is_equal_df(
  x,
  y,
  ignore_col_order = FALSE,
  ignore_row_order = FALSE,
  ignore_col_types = FALSE,
  tolerance = NULL,
  quiet = TRUE,
  max_diffs = 10L,
  return_waldo_compare = FALSE
)
Arguments
- x
The data frame / tibble to check for changes.
- y
The data frame / tibble that x should be checked against, i.e. the reference, so messages describe how x is different to y.
- ignore_col_order
Whether or not to ignore the order of columns.
- ignore_row_order
Whether or not to ignore the order of rows.
- ignore_col_types
Whether or not to ignore differences between similar column types. Currently, if set to TRUE, this will convert factors to characters and integers to doubles before the comparison.
- tolerance
If non-NULL, used as threshold for ignoring small floating point differences when comparing numeric vectors. Using any non-NULL value will cause integer and double vectors to be compared based on their values, not their types, and will ignore the difference between NaN and NA_real_.
It uses the same algorithm as all.equal(), i.e., first we generate x_diff and y_diff by subsetting x and y to look only at locations with differences. Then we check that mean(abs(x_diff - y_diff)) / mean(abs(y_diff)) (or just mean(abs(x_diff - y_diff)) if y_diff is small) is less than tolerance.
- quiet
Whether or not to output detected differences between x and y to the console.
- max_diffs
Maximum number of differences shown. Only relevant if quiet = FALSE or return_waldo_compare = TRUE. Set max_diffs = Inf to see all differences.
- return_waldo_compare
Whether to return a character vector of class waldo_compare describing the differences between x and y instead of TRUE or FALSE.
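The relative-difference check described for tolerance can be sketched in base R as follows. This is only an illustration under stated assumptions, not the code used by waldo or pal: within_tolerance() is a hypothetical helper, the inputs are assumed to be NA-free numeric vectors of equal length, and the cutoff for when y_diff counts as "small" is borrowed from how all.equal() handles it.

```r
# Hypothetical helper (not part of pal or waldo): illustrates the
# tolerance check described above for two numeric vectors without NAs.
within_tolerance <- function(x, y, tolerance = 1.5e-8) {
  # keep only the locations where x and y differ
  differs <- which(x != y)
  if (length(differs) == 0L) return(TRUE)
  x_diff <- x[differs]
  y_diff <- y[differs]

  # mean absolute difference at those locations
  avg_diff <- mean(abs(x_diff - y_diff))
  avg_y <- mean(abs(y_diff))

  # Assumption: "small" means mean(abs(y_diff)) <= tolerance; otherwise the
  # difference is scaled to become relative to y_diff.
  if (is.finite(avg_y) && avg_y > tolerance) {
    avg_diff <- avg_diff / avg_y
  }
  avg_diff < tolerance
}

res_tiny  <- within_tolerance(c(1, 2, 3), c(1, 2, 3 + 1e-10))
res_large <- within_tolerance(c(1, 2, 3), c(1, 2, 3.1))
res_loose <- within_tolerance(c(1, 2, 3), c(1, 2, 3.1), tolerance = 0.05)
print(c(res_tiny, res_large, res_loose))
```

With the default threshold, only the first pair is treated as equal; loosening tolerance to 0.05 also accepts the third call, because the relative difference 0.1 / 3.1 is about 0.032.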
Value
If return_waldo_compare = FALSE, a logical scalar indicating the result of the comparison. Otherwise a character vector of class waldo_compare describing the differences between x and y.
Details
Under the hood, this function relies on waldo::compare().
See also
Other data frame / tibble functions: assert_cols(), reduce_df_list()
Examples
scramble <- function(x) x[sample(nrow(x)), sample(ncol(x))]
# by default, ordering of rows and columns matters...
pal::is_equal_df(x = mtcars,
                 y = scramble(mtcars))
#> [1] FALSE
# ...but each can be ignored if desired; because `scramble()` shuffles both
# rows and columns, ignoring only one of the two still yields FALSE
pal::is_equal_df(x = mtcars,
                 y = scramble(mtcars),
                 ignore_col_order = TRUE)
#> [1] FALSE
pal::is_equal_df(x = mtcars,
                 y = scramble(mtcars),
                 ignore_row_order = TRUE)
#> [1] FALSE
# by default, `is_equal_df()` is sensitive to column type differences...
df1 <- data.frame(x = "a",
                  stringsAsFactors = FALSE)
df2 <- data.frame(x = factor("a"))
pal::is_equal_df(df1, df2)
#> [1] FALSE
# ...but you can request it to not make a difference between similar types
pal::is_equal_df(df1, df2,
                 ignore_col_types = TRUE)
#> [1] TRUE
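The type conversion that ignore_col_types = TRUE is documented to perform (factors to characters, integers to doubles) can be approximated in base R. The sketch below is an illustration only: normalize_col_types() is a hypothetical helper, not a function exported by pal, and it makes no claim about how pal implements the conversion internally.

```r
# Hypothetical helper (not part of pal): converts factor columns to
# character and integer columns to double, leaving other columns untouched.
normalize_col_types <- function(df) {
  df[] <- lapply(df, function(col) {
    if (is.factor(col)) {
      as.character(col)
    } else if (is.integer(col)) {
      as.double(col)
    } else {
      col
    }
  })
  df
}

df_chr <- data.frame(x = "a", n = 1L, stringsAsFactors = FALSE)
df_fct <- data.frame(x = factor("a"), n = 1)

# the raw data frames differ in column types...
identical(df_chr, df_fct)
#> [1] FALSE
# ...but are identical once similar types are unified
identical(normalize_col_types(df_chr), normalize_col_types(df_fct))
#> [1] TRUE
```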