Streamlining spectral data processing and modeling for spectroscopy applications
<!DOCTYPE html>
<!-- Generated by pkgdown: do not edit by hand --><html lang="en"><head><meta http-equiv="Content-Type" content="text/html; charset=UTF-8"><meta charset="utf-8"><meta http-equiv="X-UA-Compatible" content="IE=edge"><meta name="viewport" content="width=device-width, initial-scale=1.0"><title>Calibration sampling, and random forest model tuning and evaluation — fit_rf • simplerspec</title><!-- jquery --><script src="https://cdnjs.cloudflare.com/ajax/libs/jquery/3.4.1/jquery.min.js" integrity="sha256-CSXorXvZcTkaix6Yvo6HppcZGetbYMGWSFlBw8HfCJo=" crossorigin="anonymous"></script><!-- Bootstrap --><link rel="stylesheet" href="https://cdnjs.cloudflare.com/ajax/libs/twitter-bootstrap/3.4.1/css/bootstrap.min.css" integrity="sha256-bZLfwXAP04zRMK2BjiO8iu9pf4FbLqX6zitd+tIvLhE=" crossorigin="anonymous"><script src="https://cdnjs.cloudflare.com/ajax/libs/twitter-bootstrap/3.4.1/js/bootstrap.min.js" integrity="sha256-nuL8/2cJ5NDSSwnKD8VqreErSWHtnEP9E7AySL+1ev4=" crossorigin="anonymous"></script><!-- bootstrap-toc --><link rel="stylesheet" href="../bootstrap-toc.css"><script src="../bootstrap-toc.js"></script><!-- Font Awesome icons --><link rel="stylesheet" href="https://cdnjs.cloudflare.com/ajax/libs/font-awesome/5.12.1/css/all.min.css" integrity="sha256-mmgLkCYLUQbXn0B1SRqzHar6dCnv9oZFPEC1g1cwlkk=" crossorigin="anonymous"><link rel="stylesheet" href="https://cdnjs.cloudflare.com/ajax/libs/font-awesome/5.12.1/css/v4-shims.min.css" integrity="sha256-wZjR52fzng1pJHwx4aV2AO3yyTOXrcDW7jBpJtTwVxw=" crossorigin="anonymous"><!-- clipboard.js --><script src="https://cdnjs.cloudflare.com/ajax/libs/clipboard.js/2.0.6/clipboard.min.js" integrity="sha256-inc5kl9MA1hkeYUt+EC3BhlIgyp/2jDIyBLS6k3UxPI=" crossorigin="anonymous"></script><!-- headroom.js --><script src="https://cdnjs.cloudflare.com/ajax/libs/headroom/0.11.0/headroom.min.js" integrity="sha256-AsUX4SJE1+yuDu5+mAVzJbuYNPHj/WroHuZ8Ir/CkE0=" crossorigin="anonymous"></script><script 
src="https://cdnjs.cloudflare.com/ajax/libs/headroom/0.11.0/jQuery.headroom.min.js" integrity="sha256-ZX/yNShbjqsohH1k95liqY9Gd8uOiE1S4vZc+9KQ1K4=" crossorigin="anonymous"></script><!-- pkgdown --><link href="../pkgdown.css" rel="stylesheet"><script src="../pkgdown.js"></script><meta property="og:title" content="Calibration sampling, and random forest model tuning and evaluation — fit_rf"><meta property="og:description" content="Perform calibration sampling and use selected
calibration set for model tuning"><meta name="robots" content="noindex"><!-- mathjax --><script src="https://cdnjs.cloudflare.com/ajax/libs/mathjax/2.7.5/MathJax.js" integrity="sha256-nvJJv9wWKEm88qvoQl9ekL2J+k/RWIsaSScxxlsrv8k=" crossorigin="anonymous"></script><script src="https://cdnjs.cloudflare.com/ajax/libs/mathjax/2.7.5/config/TeX-AMS-MML_HTMLorMML.js" integrity="sha256-84DKXVJXs0/F8OTMzX4UR909+jtl4G7SPypPavF+GfA=" crossorigin="anonymous"></script><!--[if lt IE 9]>
<script src="https://oss.maxcdn.com/html5shiv/3.7.3/html5shiv.min.js"></script>
<script src="https://oss.maxcdn.com/respond/1.4.2/respond.min.js"></script>
<![endif]--></head><body data-spy="scroll" data-target="#toc">
<div class="container template-reference-topic">
<header><div class="navbar navbar-default navbar-fixed-top" role="navigation">
<div class="container">
<div class="navbar-header">
<button type="button" class="navbar-toggle collapsed" data-toggle="collapse" data-target="#navbar" aria-expanded="false">
<span class="sr-only">Toggle navigation</span>
<span class="icon-bar"></span>
<span class="icon-bar"></span>
<span class="icon-bar"></span>
</button>
<span class="navbar-brand">
<a class="navbar-link" href="../index.html">simplerspec</a>
<span class="version label label-danger" data-toggle="tooltip" data-placement="bottom" title="In-development version">0.1.0.9001</span>
</span>
</div>
<div id="navbar" class="navbar-collapse collapse">
<ul class="nav navbar-nav"><li>
<a href="../reference/index.html">Reference</a>
</li>
<li>
<a href="../news/index.html">Changelog</a>
</li>
</ul><ul class="nav navbar-nav navbar-right"><li>
<a href="https://github.com/philipp-baumann/simplerspec/" class="external-link">
<span class="fab fa-github fa-lg"></span>
</a>
</li>
</ul></div><!--/.nav-collapse -->
</div><!--/.container -->
</div><!--/.navbar -->
</header><div class="row">
<div class="col-md-9 contents">
<div class="page-header">
<h1>Calibration sampling, and random forest model tuning and evaluation</h1>
<small class="dont-index">Source: <a href="https://github.com/philipp-baumann/simplerspec/blob/HEAD/R/pls-modeling.R" class="external-link"><code>R/pls-modeling.R</code></a></small>
<div class="hidden name"><code>fit_rf.Rd</code></div>
</div>
<div class="ref-description">
<p>Perform calibration sampling and use the selected
calibration set for model tuning.</p>
</div>
<div id="ref-usage">
<div class="sourceCode"><pre class="sourceCode r"><code><span><span class="fu">fit_rf</span><span class="op">(</span></span>
<span> <span class="va">spec_chem</span>,</span>
<span> <span class="va">response</span>,</span>
<span> variable <span class="op">=</span> <span class="cn">NULL</span>,</span>
<span> evaluation_method <span class="op">=</span> <span class="st">"test_set"</span>,</span>
<span> validation <span class="op">=</span> <span class="cn">NULL</span>,</span>
<span> split_method <span class="op">=</span> <span class="st">"ken_stone"</span>,</span>
<span> <span class="va">ratio_val</span>,</span>
<span> ken_sto_pc <span class="op">=</span> <span class="fl">2</span>,</span>
<span> pc <span class="op">=</span> <span class="cn">NULL</span>,</span>
<span> invert <span class="op">=</span> <span class="cn">TRUE</span>,</span>
<span> tuning_method <span class="op">=</span> <span class="st">"resampling"</span>,</span>
<span> resampling_seed <span class="op">=</span> <span class="fl">123</span>,</span>
<span> cv <span class="op">=</span> <span class="cn">NULL</span>,</span>
<span> ntree_max <span class="op">=</span> <span class="fl">500</span>,</span>
<span> print <span class="op">=</span> <span class="cn">TRUE</span>,</span>
<span> env <span class="op">=</span> <span class="fu"><a href="https://rdrr.io/r/base/sys.parent.html" class="external-link">parent.frame</a></span><span class="op">(</span><span class="op">)</span></span>
<span><span class="op">)</span></span></code></pre></div>
</div>
<div id="arguments">
<h2>Arguments</h2>
<dl><dt>spec_chem</dt>
<dd><p>Tibble that contains spectra, metadata and chemical
reference data as list-columns. The tibble supplied to <code>spec_chem</code> can
be generated by the <code>join_chem_spc()</code> function.</p></dd>
<dt>response</dt>
<dd><p>Response variable as symbol or name
(without quotes, no character string). The provided response symbol needs to be
a column name in the <code>spec_chem</code> tibble.</p></dd>
<dt>variable</dt>
<dd><p>Deprecated and replaced by <code>response</code>.</p></dd>
<dt>evaluation_method</dt>
<dd><p>Character string stating evaluation method.
Either <code>"test_set"</code> (default) or <code>"resampling"</code>. <code>"test_set"</code>
will split the data into a calibration (training) and validation (test) set,
and evaluate the final model by predicting on the validation set.
If <code>"resampling"</code>, the finally selected model will be evaluated based
on the cross-validation hold-out predictions.</p></dd>
<dt>validation</dt>
<dd><p>Deprecated and replaced by <code>evaluation_method</code>.
Default is <code>TRUE</code>.</p></dd>
<dt>split_method</dt>
<dd><p>Method used to split the data into calibration samples and an
independent test set. Default is <code>"ken_stone"</code>, which selects samples for calibration
using the Kennard-Stone sampling algorithm on preprocessed spectra. The
proportion of validation samples relative to the total number of samples is
set with the argument <code>ratio_val</code>.
<code>split_method = "random"</code> will create a single random split.</p></dd>
<dt>ratio_val</dt>
<dd><p>Ratio of validation (test) samples to
total number of samples (calibration (training) and validation (test)).</p></dd>
<dt>ken_sto_pc</dt>
<dd><p>Number of components used
for calculating the Mahalanobis distance on PCA scores for the
Kennard-Stone algorithm.
Default is <code>ken_sto_pc = 2</code>, which will use the first two PCA
components.</p></dd>
<dt>pc</dt>
<dd><p>Deprecated; the renamed argument is <code>ken_sto_pc</code>.</p></dd>
<dt>invert</dt>
<dd><p>Logical</p></dd>
<dt>tuning_method</dt>
<dd><p>Character string specifying the tuning method. The tuning method
affects how caret selects the final set of tuning values from a list of candidate
values. With <code>"resampling"</code>, caret uses a
specified resampling method such as repeated k-fold cross-validation (see
argument <code>resampling_method</code>) and the performance profile
generated from the hold-out predictions to decide on the final tuning values
that lead to optimal model performance. The value <code>"none"</code> will force
caret to compute a final model for a predefined candidate value of the PLS tuning
parameter (number of PLS components). In this case, the value
supplied by <code>ncomp_fixed</code> is used to set model complexity at
a fixed number of components.</p></dd>
<dt>resampling_seed</dt>
<dd><p>Random seed (integer) used for generating the
resampling indices, which are supplied to <code><a href="https://rdrr.io/pkg/caret/man/trainControl.html" class="external-link">caret::trainControl</a></code>.
This makes modeling results reproducible when re-fitting.
Default is <code>resampling_seed = 123</code>.</p></dd>
<dt>cv</dt>
<dd><p>Deprecated. Use <code>resampling_method</code> instead.</p></dd>
<dt>ntree_max</dt>
<dd><p>Maximum number of random forest trees
used by <code>caret::train</code>. Caret will aggregate a performance profile using resampling
for an integer sequence from 1 to <code>ntree_max</code> trees.</p></dd>
<dt>print</dt>
<dd><p>Logical indicating whether model evaluation graphs are
printed.</p></dd>
<dt>env</dt>
<dd><p>Environment in which the function is evaluated. Default is
<code>parent.frame()</code>.</p></dd>
</dl></div>
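<div id="ref-examples">
<h2>Examples</h2>
<p>The calibration sampling described under <code>split_method</code>,
<code>ratio_val</code> and <code>ken_sto_pc</code> can be sketched in base R as follows.
This is a minimal illustration under stated assumptions, not the code that
<code>fit_rf()</code> runs internally: <code>kennard_stone()</code> is a hypothetical helper
written for this example, and the spectra are simulated random numbers.
It computes PCA scores, standardizes them so that Euclidean distance equals the
Mahalanobis distance in score space, and selects calibration samples with the
Kennard-Stone algorithm.</p>
<div class="sourceCode"><pre class="sourceCode r"><code># Hypothetical helper (not part of simplerspec): Kennard-Stone selection
# of k rows from a numeric matrix x, based on Euclidean distances.
kennard_stone &lt;- function(x, k) {
  d &lt;- as.matrix(dist(x))
  # Start with the two most distant samples
  selected &lt;- as.vector(which(d == max(d), arr.ind = TRUE)[1, ])
  while (length(selected) &lt; k) {
    remaining &lt;- setdiff(seq_len(nrow(x)), selected)
    # Distance of each remaining sample to its nearest selected sample
    min_dist &lt;- apply(d[remaining, selected, drop = FALSE], 1, min)
    # Add the remaining sample that is farthest from the selected set
    selected &lt;- c(selected, remaining[which.max(min_dist)])
  }
  selected
}

set.seed(123)
# Simulated "spectra": 50 samples x 20 variables (illustrative data only)
spectra &lt;- matrix(rnorm(50 * 20), nrow = 50)

ken_sto_pc &lt;- 2   # number of PCA components used for the distance
ratio_val &lt;- 1 / 3  # proportion of validation (test) samples

pca &lt;- prcomp(spectra, center = TRUE, scale. = FALSE)
# PCA scores are uncorrelated, so scaling them to unit variance makes
# Euclidean distance equal to the Mahalanobis distance in score space
scores &lt;- scale(pca$x[, seq_len(ken_sto_pc), drop = FALSE])

# Kennard-Stone selects the calibration samples; the rest is validation
n_cal &lt;- round((1 - ratio_val) * nrow(spectra))
cal_idx &lt;- kennard_stone(scores, k = n_cal)
val_idx &lt;- setdiff(seq_len(nrow(spectra)), cal_idx)

length(cal_idx)
length(val_idx)
</code></pre></div>
<p>In this sketch the Kennard-Stone samples form the calibration set and the
remaining samples form the validation set, which matches the description of
<code>split_method</code> above. How the <code>invert</code> argument changes this
assignment is not documented on this page, so it is left out here.</p>
</div>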
</div>
<div class="col-md-3 hidden-xs hidden-sm" id="pkgdown-sidebar">
<nav id="toc" data-toggle="toc" class="sticky-top"><h2 data-toc-skip>Contents</h2>
</nav></div>
</div>
<footer><div class="copyright">
<p></p><p>Developed by Philipp Baumann.</p>
</div>
<div class="pkgdown">
<p></p><p>Site built with <a href="https://pkgdown.r-lib.org/" class="external-link">pkgdown</a> 2.0.7.</p>
</div>
</footer></div>
</body></html>