The errata list records errors found after the product was released, together with their corrections. If an error was corrected in a later version or reprint, the date of the correction appears in the "Date corrected" column.
The following errata were submitted by our customers and approved as valid errors by the author or editor.
Version |
Location |
Description |
Submitted By |
Date submitted |
Date corrected |
|
Page 64-65
Between last paragraph on p. 64 and first paragraph on p. 65 |
Currently the text from p. 64 to p. 65 reads:
- Specify the type of model based on its mathematical structure: Such as linear regression, random forest, KNN, etc. Most often this reflects the software package that should be used, like Stan or glmnet. These are models in their own right, and parsnip provides consistent interfaces by using these as engines for modeling.
- When required, declare the mode of the model: The mode reflects the type of prediction outcome. For numeric outcomes, the mode is regression; for qualitative outcomes, it is classification.1 If a model algorithm can only address one type of prediction outcome, such as linear regression, the mode is already set.
Instead, that text should read as follows:
- Specify the type of model based on its mathematical structure (e.g., linear regression, random forest, KNN, etc).
- Specify the engine for fitting the model: Most often this reflects the software package that should be used, like Stan or glmnet. These are models in their own right, and parsnip provides consistent interfaces by using these as engines for modeling.
- When required, declare the mode of the model: The mode reflects the type of prediction outcome. For numeric outcomes, the mode is regression; for qualitative outcomes, it is classification.13 If a model algorithm can only address one type of prediction outcome, such as linear regression, the mode is already set.
|
Julia Silge |
Aug 26, 2022 |
|
|
Page 95
3rd paragraph, right below Fig 8.1 |
The text currently says:
Here we see that two neighborhoods have less than five properties in the training data (Landmark and Green Hills); in this case, no houses at all in the Landmark neighborhood were included in the training set.
The text should instead read:
Here we see that two neighborhoods have less than five properties in the training data (Landmark and Green Hills); in this case, no houses at all in the Landmark neighborhood were included in the testing set.
|
Julia Silge |
Aug 10, 2022 |
|
|
Page 130
Fig 10-2 |
The data point labeled "21" should be drawn as a circle (not a square) to show that it belongs to the first held-out fold. It is drawn correctly in Fig 10-3.
|
Julia Silge |
Sep 07, 2022 |
|
|
Page 133
1st paragraph in section "Leave-One-Out-Validation" |
In the first sentence, the phrase:
> where V is the number of data points in the training set
should be omitted.
|
Julia Silge |
Sep 19, 2022 |
|
Printed |
Page 209
1st paragraph after Fig 13-9 |
The sentence that currently reads:
> Any parameter set whose confidence interval includes zero would lack evidence that its performance is not statistically different from the best results.
should have the "not" removed. It should read:
> Any parameter set whose confidence interval includes zero would lack evidence that its performance is statistically different from the best results.
|
Julia Silge |
Sep 22, 2022 |
|
Printed |
Page 241-242
Code chunk at the end of page 241 and 242 |
This code chunk should not have been included:
grid_ctrl <-
  control_grid(
    save_pred = TRUE,
    parallel_over = "everything",
    save_workflow = TRUE
  )
full_results_time <-
  system.time(
    grid_results <-
      all_workflows %>%
      workflow_map(seed = 1503, resamples = concrete_folds, grid = 25,
                   control = grid_ctrl, verbose = TRUE)
  )
#> i 1 of 12 tuning: MARS
#> ✔ 1 of 12 tuning: MARS (12.5s)
#> i 2 of 12 tuning: CART
#> ✔ 2 of 12 tuning: CART (2m 37.6s)
#> i No tuning parameters. `fit_resamples()` will be attempted
#> i 3 of 12 resampling: CART_bagged
#> ✔ 3 of 12 resampling: CART_bagged (1m 33.9s)
#> i 4 of 12 tuning: RF
#> i Creating pre-processing data to finalize unknown parameter: mtry
#> ✔ 4 of 12 tuning: RF (7m 31.8s)
#> i 5 of 12 tuning: boosting
#> ✔ 5 of 12 tuning: boosting (11m 50.6s)
#> i 6 of 12 tuning: Cubist
#> ✔ 6 of 12 tuning: Cubist (10m 30.8s)
#> i 7 of 12 tuning: SVM_radial
#> ✔ 7 of 12 tuning: SVM_radial (3m 36s)
#> i 8 of 12 tuning: SVM_poly
#> ✔ 8 of 12 tuning: SVM_poly (37m 21.3s)
#> i 9 of 12 tuning: KNN
#> ✔ 9 of 12 tuning: KNN (4m 2.1s)
#> i 10 of 12 tuning: neural_network
#> ✔ 10 of 12 tuning: neural_network (8m 8.9s)
#> i 11 of 12 tuning: full_quad_linear_reg
#> ✔ 11 of 12 tuning: full_quad_linear_reg (5m 24.7s)
#> i 12 of 12 tuning: full_quad_KNN
#> ✔ 12 of 12 tuning: full_quad_KNN (17m 34.6s)
num_grid_models <- nrow(collect_metrics(grid_results, summarize = FALSE))
It should not have been rendered when printing.
|
Julia Silge |
Sep 23, 2022 |
|
Printed, PDF, ePub, Mobi, Other Digital Version |
Page 244
Last paragraph of section |
The sentence that reads:
The example model screening with our concrete mixture data fits a total of 25,200 models.
should instead read:
The example model screening with our concrete mixture data fits a total of 12,600 models.
|
Julia Silge |
Sep 23, 2022 |
|
Printed, PDF, ePub, Mobi, Other Digital Version |
Page 245
Last paragraph of the page |
The sentence which reads:
Overall, the racing approach estimated a total of 4,652 models, 18.46% of the full set of 25,200 models in the full grid. As a result, the racing approach was 4.7-fold faster.
should instead read:
Overall, the racing approach estimated a total of 2,335 models, 18.53% of the full set of 12,600 models in the full grid. As a result, the racing approach was 4.7-fold faster.
|
Julia Silge |
Sep 23, 2022 |
|
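As a quick arithmetic check on the page 245 entry (a worked calculation added here for convenience, not part of the submitted erratum), the percentages in both the original and the corrected sentence follow from the model counts they quote:

```latex
\[
\frac{4{,}652}{25{,}200} \approx 0.1846 = 18.46\%
\qquad\text{(original sentence)}
\]
\[
\frac{2{,}335}{12{,}600} \approx 0.1853 = 18.53\%
\qquad\text{(corrected sentence)}
\]
```

The corrected full-grid total of 12,600 models is exactly half of the originally printed 25,200, which matches the page 244 correction above.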