Streamlining spectral data processing and modeling for spectroscopy applications

<!DOCTYPE html>
<!-- Generated by pkgdown: do not edit by hand --><html lang="en">
<head>
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
<meta charset="utf-8">
<meta http-equiv="X-UA-Compatible" content="IE=edge">
<meta name="viewport" content="width=device-width, initial-scale=1.0">
<title>Soil and plant spectroscopic model building and prediction • simplerspec</title>
<!-- jquery --><script src="https://cdnjs.cloudflare.com/ajax/libs/jquery/3.4.1/jquery.min.js" integrity="sha256-CSXorXvZcTkaix6Yvo6HppcZGetbYMGWSFlBw8HfCJo=" crossorigin="anonymous"></script><!-- Bootstrap --><link rel="stylesheet" href="https://cdnjs.cloudflare.com/ajax/libs/twitter-bootstrap/3.4.1/css/bootstrap.min.css" integrity="sha256-bZLfwXAP04zRMK2BjiO8iu9pf4FbLqX6zitd+tIvLhE=" crossorigin="anonymous">
<script src="https://cdnjs.cloudflare.com/ajax/libs/twitter-bootstrap/3.4.1/js/bootstrap.min.js" integrity="sha256-nuL8/2cJ5NDSSwnKD8VqreErSWHtnEP9E7AySL+1ev4=" crossorigin="anonymous"></script><!-- bootstrap-toc --><link rel="stylesheet" href="bootstrap-toc.css">
<script src="bootstrap-toc.js"></script><!-- Font Awesome icons --><link rel="stylesheet" href="https://cdnjs.cloudflare.com/ajax/libs/font-awesome/5.12.1/css/all.min.css" integrity="sha256-mmgLkCYLUQbXn0B1SRqzHar6dCnv9oZFPEC1g1cwlkk=" crossorigin="anonymous">
<link rel="stylesheet" href="https://cdnjs.cloudflare.com/ajax/libs/font-awesome/5.12.1/css/v4-shims.min.css" integrity="sha256-wZjR52fzng1pJHwx4aV2AO3yyTOXrcDW7jBpJtTwVxw=" crossorigin="anonymous">
<!-- clipboard.js --><script src="https://cdnjs.cloudflare.com/ajax/libs/clipboard.js/2.0.6/clipboard.min.js" integrity="sha256-inc5kl9MA1hkeYUt+EC3BhlIgyp/2jDIyBLS6k3UxPI=" crossorigin="anonymous"></script><!-- headroom.js --><script src="https://cdnjs.cloudflare.com/ajax/libs/headroom/0.11.0/headroom.min.js" integrity="sha256-AsUX4SJE1+yuDu5+mAVzJbuYNPHj/WroHuZ8Ir/CkE0=" crossorigin="anonymous"></script><script src="https://cdnjs.cloudflare.com/ajax/libs/headroom/0.11.0/jQuery.headroom.min.js" integrity="sha256-ZX/yNShbjqsohH1k95liqY9Gd8uOiE1S4vZc+9KQ1K4=" crossorigin="anonymous"></script><!-- pkgdown --><link href="pkgdown.css" rel="stylesheet">
<script src="pkgdown.js"></script><meta property="og:title" content="Soil and plant spectroscopic model building and prediction">
<meta property="og:description" content="Functions that cover
reading of spectral data, outlier removal,
spectral preprocessing, calibration sampling, PLS regression
using caret, and model diagnostic statistics and plots.">
<meta name="robots" content="noindex">
<!-- mathjax --><script src="https://cdnjs.cloudflare.com/ajax/libs/mathjax/2.7.5/MathJax.js" integrity="sha256-nvJJv9wWKEm88qvoQl9ekL2J+k/RWIsaSScxxlsrv8k=" crossorigin="anonymous"></script><script src="https://cdnjs.cloudflare.com/ajax/libs/mathjax/2.7.5/config/TeX-AMS-MML_HTMLorMML.js" integrity="sha256-84DKXVJXs0/F8OTMzX4UR909+jtl4G7SPypPavF+GfA=" crossorigin="anonymous"></script><!--[if lt IE 9]>
<script src="https://oss.maxcdn.com/html5shiv/3.7.3/html5shiv.min.js"></script>
<script src="https://oss.maxcdn.com/respond/1.4.2/respond.min.js"></script>
<![endif]-->
</head>
<body data-spy="scroll" data-target="#toc">
<div class="container template-home">
<header><div class="navbar navbar-default navbar-fixed-top" role="navigation">
<div class="container">
<div class="navbar-header">
<button type="button" class="navbar-toggle collapsed" data-toggle="collapse" data-target="#navbar" aria-expanded="false">
<span class="sr-only">Toggle navigation</span>
<span class="icon-bar"></span>
<span class="icon-bar"></span>
<span class="icon-bar"></span>
</button>
<span class="navbar-brand">
<a class="navbar-link" href="index.html">simplerspec</a>
<span class="version label label-danger" data-toggle="tooltip" data-placement="bottom" title="In-development version">0.1.0.9001</span>
</span>
</div>
<div id="navbar" class="navbar-collapse collapse">
<ul class="nav navbar-nav">
<li>
<a href="reference/index.html">Reference</a>
</li>
<li>
<a href="news/index.html">Changelog</a>
</li>
</ul>
<ul class="nav navbar-nav navbar-right">
<li>
<a href="https://github.com/philipp-baumann/simplerspec/" class="external-link">
<span class="fab fa-github fa-lg"></span>
</a>
</li>
</ul>
</div>
<!--/.nav-collapse -->
</div>
<!--/.container -->
</div>
<!--/.navbar -->
</header><div class="row">
<div class="contents col-md-9">
<div class="section level1">
<div class="page-header"><h1 id="simplerspec-">simplerspec <img src="reference/figures/simplerspec-logo.png" align="right" width="250"><a class="anchor" aria-label="anchor" href="#simplerspec-"></a>
</h1></div>
<!-- badges: start -->
<div class="section level2">
<h2 id="short-description">Short description<a class="anchor" aria-label="anchor" href="#short-description"></a>
</h2>
<p>The simplerspec package aims to facilitate the handling of spectral and additional data, as well as model development, for spectroscopy applications such as infrared soil spectroscopy. Its helper functions are designed to form a data and modeling workflow. Data inputs and outputs are stored in common S3 <code>R</code> objects (<code>lists</code> and <code>data frames</code>), complemented by the <a href="https://rdatatable.gitlab.io/data.table/" class="external-link"><code>data.table</code></a> and <a href="https://tibble.tidyverse.org/index.html" class="external-link"><code>tibble</code></a> extensions. The functions are built to work in a pipeline and cover commonly used procedures for spectral model development and application.</p>
</div>
<div class="section level2">
<h2 id="installation">Installation<a class="anchor" aria-label="anchor" href="#installation"></a>
</h2>
<p>The newest version of the package is available in this GitHub repository. If you find bugs, you are very welcome to report them (write me an <a href="mailto:info@spectral-cockpit.space">email</a> or create an <a href="https://github.com/philipp-baumann/simplerspec/issues" class="external-link">issue</a>). You can install {simplerspec} from GitHub (option 1) or directly from the r-universe (option 2).</p>
<div class="sourceCode" id="cb1"><pre class="downlit sourceCode r">
<code class="sourceCode R"><span><span class="co"># option 1</span></span>
<span><span class="kw">if</span> <span class="op">(</span><span class="op">!</span><span class="kw"><a href="https://rdrr.io/r/base/library.html" class="external-link">require</a></span><span class="op">(</span><span class="st"><a href="https://remotes.r-lib.org" class="external-link">"remotes"</a></span><span class="op">)</span><span class="op">)</span> <span class="fu"><a href="https://rdrr.io/r/utils/install.packages.html" class="external-link">install.packages</a></span><span class="op">(</span><span class="st">"remotes"</span><span class="op">)</span></span>
<span><span class="fu">remotes</span><span class="fu">::</span><span class="fu"><a href="https://remotes.r-lib.org/reference/install_github.html" class="external-link">install_github</a></span><span class="op">(</span><span class="st">"philipp-baumann/simplerspec"</span><span class="op">)</span></span>
<span><span class="co"># option 2</span></span>
<span><span class="fu"><a href="https://rdrr.io/r/utils/install.packages.html" class="external-link">install.packages</a></span><span class="op">(</span><span class="st">"simplerspec"</span>,</span>
<span> repos <span class="op">=</span> <span class="fu"><a href="https://rdrr.io/r/base/c.html" class="external-link">c</a></span><span class="op">(</span><span class="st">"https://philipp-baumann.r-universe.dev"</span>, <span class="st">"https://cloud.r-project.org"</span><span class="op">)</span><span class="op">)</span></span></code></pre></div>
</div>
<div class="section level2">
<h2 id="key-features">Key features<a class="anchor" aria-label="anchor" href="#key-features"></a>
</h2>
<p>The current version of the package includes, among others, the following functions:</p>
<ol style="list-style-type: decimal">
<li>
<a href="https://github.com/philipp-baumann/simplerspec/blob/master/R/read-opus-universal.R" class="external-link"><code>read_opus_univ()</code></a>: Read spectra and metadata from Bruker OPUS binary files into R list</li>
<li>
<a href="https://github.com/philipp-baumann/simplerspec/blob/master/R/gather-spc.R" class="external-link"><code>gather_spc()</code></a>: Gather spectra and metadata from list into a tibble object (list-columns)</li>
<li>
<a href="https://github.com/philipp-baumann/simplerspec/blob/master/R/resample-spc.R" class="external-link"><code>resample_spc()</code></a>: Resample spectra to new wavenumber intervals</li>
<li>
<a href="https://github.com/philipp-baumann/simplerspec/blob/master/R/average-spc.R" class="external-link"><code>average_spc()</code></a>: Average spectra for replicate scans</li>
<li>
<a href="https://github.com/philipp-baumann/simplerspec/blob/master/R/preprocess-spc.R" class="external-link"><code>preprocess_spc()</code></a>: Perform pre-processing of spectra</li>
<li>
<a href="https://github.com/philipp-baumann/simplerspec/blob/master/R/select-spc.R" class="external-link"><code>select_spc_vars()</code></a>: Select every <code>n</code>-th spectral variable and corresponding X-unit values.</li>
<li>
<a href="https://github.com/philipp-baumann/simplerspec/blob/master/R/join-chem-spectra.R" class="external-link"><code>join_spc_chem()</code></a>: Join chemical and spectral data sets by <code>sample_id</code>
</li>
<li>
<a href="https://github.com/philipp-baumann/simplerspec/blob/master/R/plot-spc-extended.R" class="external-link"><code>plot_spc_ext()</code></a>: Extended spectral plotting; e.g. group spectra using different panels or color spectra based on chemical reference values to explore trends.</li>
<li>
<a href="https://github.com/philipp-baumann/simplerspec/blob/master/R/pls-modeling.R" class="external-link"><code>fit_pls()</code></a>: Perform model tuning and evaluation based on Partial Least Squares (PLS) regression</li>
<li>
<a href="https://github.com/philipp-baumann/simplerspec/blob/master/R/select-ref-spectra.R" class="external-link"><code>select_ref_spc()</code></a>: Select a set of reference samples to be measured by traditional analysis methods when no a priori sample data except spectra are available (based on Kennard-Stone sampling)</li>
<li>
<a href="https://github.com/philipp-baumann/simplerspec/blob/master/R/predict-spc.R" class="external-link"><code>predict_from_spc()</code></a>: Predict multiple chemical properties from a list of calibrated models and new soil spectra</li>
<li>
<a href="https://github.com/philipp-baumann/simplerspec/blob/master/R/utils-stats.R" class="external-link"><code>assess_multimodels()</code></a>: Assess model performance given multiple pairs of predicted and measured variables.</li>
</ol>
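<p>To make three of the steps listed above more concrete (resampling, replicate averaging, and selecting every <code>n</code>-th variable), here is a minimal base R sketch on simulated data. It does not call simplerspec and is not how the package implements these steps; the toy matrix, the linear interpolation, and all object names are assumptions made for illustration.</p>
<div class="sourceCode"><pre class="sourceCode r"><code class="sourceCode R"># Toy spectra: 4 scans (2 replicates for each of 2 samples) on a 4 cm^-1 grid
wn &lt;- seq(4000, 600, by = -4) # original wavenumbers (descending)
set.seed(1)
spc &lt;- matrix(runif(4 * length(wn)), nrow = 4,
  dimnames = list(c("s1.0", "s1.1", "s2.0", "s2.1"), NULL))
# Derive the sample identifier by stripping the replicate suffix
sample_id &lt;- sub("\\.[0-9]+$", "", rownames(spc))

# 1) Resample every spectrum to a new, regular wavenumber grid
#    (linear interpolation; an assumption for this sketch)
wn_new &lt;- seq(3996, 604, by = -8)
spc_rs &lt;- t(apply(spc, 1, function(y) approx(x = wn, y = y, xout = wn_new)$y))

# 2) Average replicate scans per sample: sum rows by group, divide by group size
spc_mean &lt;- rowsum(spc_rs, group = sample_id) / as.vector(table(sample_id))

# 3) Keep every n-th spectral variable and the matching wavenumbers
n &lt;- 4
keep &lt;- seq(1, ncol(spc_mean), by = n)
spc_sel &lt;- spc_mean[, keep, drop = FALSE]
wn_sel &lt;- wn_new[keep]

dim(spc_sel) # 2 samples x 107 retained variables</code></pre></div>
<p>In the package itself, these steps operate on list-columns of a tibble rather than on a plain matrix, so the sketch only conveys the underlying idea.</p>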
</div>
<div class="section level2">
<h2 id="cheatsheet">Cheatsheet<a class="anchor" aria-label="anchor" href="#cheatsheet"></a>
</h2>
<p><a href="https://github.com/philipp-baumann/spc-proc-concepts/blob/master/img/simplerspec_cheatsheet_crop.pdf" class="external-link"><img src="https://github.com/philipp-baumann/spc-proc-concepts/blob/master/img/simplerspec_cheatsheet.png" width="630"></a></p>
</div>
<div class="section level2">
<h2 id="motivation-and-key-concepts">Motivation and key concepts<a class="anchor" aria-label="anchor" href="#motivation-and-key-concepts"></a>
</h2>
<p>Many R packages are available to do tasks in spectral modeling such as pre-processing of spectral data. The motivation to create this package was:</p>
<ol style="list-style-type: decimal">
<li>Avoid repetition of code in model development (common source of errors).</li>
<li>Provide a reproducible data analysis workflow for FT-IR spectroscopy.</li>
<li>R packages are an ideal way to organize and share R code.</li>
<li>Make soil FT-IR spectroscopy modeling accessible to people who have basic R knowledge.</li>
<li>Provide an integrated data-model framework that features tidy data structures designed for both user-friendly printing and efficient data processing.</li>
</ol>
<p>This package builds mainly upon functions from the following R packages:</p>
<ul>
<li>
<a href="https://cran.r-project.org/web/packages/prospectr/index.html" class="external-link"><code>prospectr</code></a>: Various utilities for pre-processing and sample selection based on spectroscopic data. An introduction to the package with examples can be found <a href="https://l-ramirez-lopez.r-universe.dev/articles/prospectr/prospectr.html" class="external-link">here</a>.</li>
<li>
<code>plyr</code> and <a href="https://dplyr.tidyverse.org" class="external-link"><code>dplyr</code></a>: Fast data manipulation tools with a unified interface.</li>
<li>
<code>ggplot2</code>: Alternative plotting system for R, based on the grammar of graphics. See <a href="https://ggplot2.tidyverse.org" class="external-link">here</a>.</li>
<li>
<code>caret</code>: Classification and regression training. A set of functions that attempt to streamline the process for creating predictive models. See <a href="https://topepo.github.io/caret/" class="external-link">here</a> for details.</li>
</ul>
<p>Consistent and reproducible data and metadata management is an important prerequisite for spectral model development. Therefore, simplerspec functions store spectral data and related data in R data structures that keep related data in rows. Every row represents an observation and contains the data related to a single spectral measurement. Simplerspec functions use tibble data frames as their principal data structures because they allow lists to be stored within the well-known data frame structure. Lists are flexible data structures and can, for example, contain other lists, vectors, data.frames, or matrices.</p>
<p>The list-column features provided by the tibble framework are an excellent basis for working with functional programming tools in R, which make it possible to write code efficiently. Simplerspec internally uses popular functional programming tools provided by the <a href="https://purrr.tidyverse.org/" class="external-link"><code>purrr</code></a> package for processing and transforming spectra. To learn more, I recommend <a href="https://github.com/jennybc/purrr-tutorial/tree/gh-pages" class="external-link">this nice purrr list-column tutorial</a> provided by Jenny Bryan. Further, simplerspec integrates well with the data processing API provided by the dplyr package, which makes spectroscopic analysis tidy and easy to understand.</p>
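<p>As a small illustration of the list-column idea described above, the following sketch builds a toy tibble with one row per measurement and processes the spectra with <code>purrr</code>. The data, the column names <code>spc_max</code> and <code>spc_centered</code>, and the chosen operations are hypothetical and only mimic the kind of structure simplerspec works with.</p>
<div class="sourceCode"><pre class="sourceCode r"><code class="sourceCode R">library(tibble)
library(purrr)

# Hypothetical toy data: one row per spectral measurement
spc_tbl &lt;- tibble(
  sample_id = c("soil_a", "soil_a", "soil_b"),
  # List-column: each element holds one spectrum as a numeric vector
  spc = list(c(0.10, 0.20, 0.30), c(0.12, 0.18, 0.33), c(0.40, 0.50, 0.60)),
  # List-column with the x-axis values belonging to each spectrum
  wavenumbers = rep(list(c(4000, 3998, 3996)), 3)
)

# map_dbl() applies a function to every spectrum and returns one number per row
spc_tbl$spc_max &lt;- map_dbl(spc_tbl$spc, max)

# map() returns a list, so the result can be stored as a new list-column
spc_tbl$spc_centered &lt;- map(spc_tbl$spc, function(x) x - mean(x))

spc_tbl</code></pre></div>
<p>Because each spectrum stays in the same row as its identifiers and metadata, derived columns cannot get out of sync with the observations they belong to.</p>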
</div>
<div class="section level2">
<h2 id="example-workflow">Example workflow<a class="anchor" aria-label="anchor" href="#example-workflow"></a>
</h2>
<p>Bruker FTIR spectrometers produce binary files in the OPUS format. These files can contain different types of spectra and many parameters, such as the instrument type and the settings that were used at the time of data acquisition and internal processing (e.g. Fourier transform operations). In essence, the entire set of measurement setup parameters, the selected spectra, and supplementary metadata such as the time of measurement are written into OPUS binary files. In contrast to simple text files, which contain only plain text with a defined character encoding, binary files can contain any type of data represented as sequences of bytes (a single byte is a sequence of 8 bits, and each bit represents either 0 or 1).</p>
<p>Simplerspec comes with the reader function <code><a href="reference/read_opus_univ.html">read_opus_univ()</a></code>, which is intended to be a universal Bruker OPUS file reader that extracts spectra and key metadata from files. Usually, one is mostly interested in extracting the final absorbance spectra (shown as <em>AB</em> in the OPUS viewer software).</p>
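<p>The difference between text and binary data can be tried out with base R alone. The sketch below writes a few integers to a temporary binary file and reads them back. It says nothing about the actual OPUS file layout; the values, the 4-byte size, and the little-endian byte order are assumptions chosen for the example.</p>
<div class="sourceCode"><pre class="sourceCode r"><code class="sourceCode R"># Write three integers to a temporary binary file, 4 bytes each
tmp &lt;- tempfile(fileext = ".bin")
writeBin(c(1L, 256L, 65535L), tmp, size = 4, endian = "little")

# On disk, the file is just a sequence of bytes: 3 values x 4 bytes
n_bytes &lt;- file.size(tmp)
raw_bytes &lt;- readBin(tmp, what = "raw", n = n_bytes)

# The bytes only become meaningful once the reader knows the data type,
# the size of each value, and the byte order
values &lt;- readBin(tmp, what = "integer", n = 3, size = 4, endian = "little")

# A single byte consists of 8 bits (least significant bit first)
bits_first_byte &lt;- as.integer(rawToBits(raw_bytes[1]))

# Remove the temporary file
unlink(tmp)</code></pre></div>
<p>A dedicated reader for a binary format has to know where each block of bytes starts and how to interpret it, which is why a plain text import function cannot be used for such files.</p>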
<div class="sourceCode" id="cb2"><pre class="downlit sourceCode r">
<code class="sourceCode R"><span><span class="co"># Load simplerspec package for spectral model development wrapper functions</span></span>
<span><span class="kw"><a href="https://rdrr.io/r/base/library.html" class="external-link">library</a></span><span class="op">(</span><span class="va"><a href="https://github.com/philipp-baumann/simplerspec" class="external-link">simplerspec</a></span><span class="op">)</span></span>
<span><span class="co"># Load tidyverse packages: loads packages frequently used for data manipulation,</span></span>
<span><span class="co"># data tidying, import, and plotting</span></span>
<span><span class="kw"><a href="https://rdrr.io/r/base/library.html" class="external-link">library</a></span><span class="op">(</span><span class="va"><a href="https://tidyverse.tidyverse.org" class="external-link">tidyverse</a></span><span class="op">)</span></span>
<span></span>
<span><span class="co">################################################################################</span></span>
<span><span class="co">## Part 1: Read and pre-process spectra, read chemical data, and join</span></span>
<span><span class="co">## spectral and chemical data sets</span></span>
<span><span class="co">################################################################################</span></span>
<span></span>
<span><span class="co">## Read spectra in list ========================================================</span></span>
<span></span>
<span><span class="co"># List of OPUS binary spectra files</span></span>
<span><span class="va">lf</span> <span class="op">&lt;-</span> <span class="fu"><a href="https://rdrr.io/r/base/list.files.html" class="external-link">dir</a></span><span class="op">(</span><span class="st">"data/spectra/soilspec_eth_bin"</span>, full.names <span class="op">=</span> <span class="cn">TRUE</span><span class="op">)</span></span>
<span></span>
<span><span class="co"># Read spectra from files into R list</span></span>
<span><span class="va">spc_list</span> <span class="op">&lt;-</span> <span class="fu"><a href="reference/read_opus_univ.html">read_opus_univ</a></span><span class="op">(</span>fnames <span class="op">=</span> <span class="va">lf</span>, extract <span class="op">=</span> <span class="fu"><a href="https://rdrr.io/r/base/c.html" class="external-link">c</a></span><span class="op">(</span><span class="st">"spc"</span><span class="op">)</span><span class="op">)</span></span>
<span><span class="co"># Returns messages:</span></span>
<span><span class="co">#&gt; Extracted spectra data from file: &lt;BF_lo_01_soil_cal.0&gt;</span></span>
<span><span class="co">#&gt; Extracted spectra data from file: &lt;BF_lo_01_soil_cal.1&gt;</span></span>
<span><span class="co">#&gt; Extracted spectra data from file: &lt;BF_lo_01_soil_cal.2&gt;</span></span>
<span><span class="co">#&gt; Extracted spectra data from file: &lt;BF_lo_02_soil_cal.0&gt;</span></span>
<span><span class="co">#&gt; ...</span></span></code></pre></div>
<p>Pipes can make R code more readable and allow step-wise data processing when developing spectral models. The pipe operator (<code>%&gt;%</code>, read as “then”) was introduced to R by the magrittr package. It improves the readability of code and avoids having to create intermediate objects. The basic behavior of the pipe operator is that the object on the left-hand side is passed as the first argument to the function on the right-hand side. When loading the tidyverse packages, the pipe operator is attached to the current R session. More details can be found <a href="https://magrittr.tidyverse.org" class="external-link">here</a>.</p>
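<p>The following toy example, which uses only magrittr and base R functions, shows that a piped expression is equivalent to the corresponding nested call. The vector <code>x</code> and the chosen functions are arbitrary.</p>
<div class="sourceCode"><pre class="sourceCode r"><code class="sourceCode R">library(magrittr)

x &lt;- c(4, 9, 16)

# Nested calls have to be read from the inside out
nested &lt;- round(mean(sqrt(x)), digits = 1)

# With pipes, the same computation reads from left to right:
# take x, then take the square root, then the mean, then round
piped &lt;- x %&gt;%
  sqrt() %&gt;%
  mean() %&gt;%
  round(digits = 1)

# Both forms give the same result
identical(nested, piped)</code></pre></div>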
<p>The model development process can be quickly coded as the example below illustrates:</p>
<div class="sourceCode" id="cb3"><pre class="downlit sourceCode r">
<code class="sourceCode R"><span><span class="co">## Spectral data processing pipe ===============================================</span></span>
<span></span>
<span><span class="va">soilspec_tbl</span> <span class="op">&lt;-</span> <span class="va">spc_list</span> <span class="op"><a href="reference/pipe.html">%&gt;%</a></span></span>
<span> <span class="co"># Gather list of spectra data into tibble data frame</span></span>
<span> <span class="fu"><a href="reference/gather_spc.html">gather_spc</a></span><span class="op">(</span><span class="op">)</span> <span class="op"><a href="reference/pipe.html">%&gt;%</a></span> </span>
<span> <span class="co"># Resample spectra to new wavenumber interval</span></span>
<span> <span class="fu"><a href="reference/resample_spc.html">resample_spc</a></span><span class="op">(</span>wn_lower <span class="op">=</span> <span class="fl">500</span>, wn_upper <span class="op">=</span> <span class="fl">3996</span>, wn_interval <span class="op">=</span> <span class="fl">2</span><span class="op">)</span> <span class="op"><a href="reference/pipe.html">%&gt;%</a></span></span>
<span> <span class="co"># Average replicate scans per sample_id</span></span>
<span> <span class="fu"><a href="reference/average_spc.html">average_spc</a></span><span class="op">(</span><span class="op">)</span> <span class="op"><a href="reference/pipe.html">%&gt;%</a></span></span>
<span> <span class="co"># Preprocess spectra using Savitzky-Golay first derivative with a window size</span></span>
<span> <span class="co"># of 21 points</span></span>
<span> <span class="fu"><a href="reference/preprocess_spc.html">preprocess_spc</a></span><span class="op">(</span>select <span class="op">=</span> <span class="st">"sg_1_w21"</span><span class="op">)</span></span>
<span> </span>
<span><span class="va">soilspec_tbl</span></span>
<span><span class="co">#&gt; # A tibble: 284 x 11</span></span>
<span><span class="co">#&gt; unique_id file_id sample_id</span></span>
<span><span class="co">#&gt; &lt;chr&gt; &lt;chr&gt; &lt;chr&gt;</span></span>
<span><span class="co">#&gt; 1 BF_lo_01_soil_cal.0_2015-11-06 14:34:10 BF_lo_01_soil_cal.0 BF_lo_01_soil_cal</span></span>
<span><span class="co">#&gt; 2 BF_lo_01_soil_cal.1_2015-11-06 14:38:14 BF_lo_01_soil_cal.1 BF_lo_01_soil_cal</span></span>
<span><span class="co">#&gt; 3 BF_lo_01_soil_cal.2_2015-11-06 14:40:55 BF_lo_01_soil_cal.2 BF_lo_01_soil_cal</span></span>
<span><span class="co">#&gt; 4 BF_lo_02_soil_cal.0_2015-11-06 17:27:55 BF_lo_02_soil_cal.0 BF_lo_02_soil_cal</span></span>
<span><span class="co">#&gt; 5 BF_lo_02_soil_cal.1_2015-11-06 17:30:19 BF_lo_02_soil_cal.1 BF_lo_02_soil_cal</span></span>
<span><span class="co">#&gt; 6 BF_lo_02_soil_cal.2_2015-11-06 17:32:47 BF_lo_02_soil_cal.2 BF_lo_02_soil_cal</span></span>
<span><span class="co">#&gt; 7 BF_lo_03_soil_cal.0_2015-11-09 11:32:55 BF_lo_03_soil_cal.0 BF_lo_03_soil_cal</span></span>
<span><span class="co">#&gt; 8 BF_lo_03_soil_cal.1_2015-11-09 11:35:26 BF_lo_03_soil_cal.1 BF_lo_03_soil_cal</span></span>
<span><span class="co">#&gt; 9 BF_lo_03_soil_cal.2_2015-11-09 11:38:08 BF_lo_03_soil_cal.2 BF_lo_03_soil_cal</span></span>
<span><span class="co">#&gt; 10 BF_lo_04_soil_cal.0_2015-11-06 10:36:13 BF_lo_04_soil_cal.0 BF_lo_04_soil_cal</span></span>
<span><span class="co">#&gt; # ... with 274 more rows, and 8 more variables: spc &lt;list&gt;, wavenumbers &lt;list&gt;,</span></span>
<span><span class="co">#&gt; # metadata &lt;list&gt;, spc_rs &lt;list&gt;, wavenumbers_rs &lt;list&gt;, spc_mean &lt;list&gt;,</span></span>
<span><span class="co">#&gt; # spc_pre &lt;list&gt;, xvalues_pre &lt;list&gt;</span></span>
<span> </span>
<span></span>
<span><span class="co">## Read chemical reference data and join with spectral data ====================</span></span>
<span></span>
<span><span class="co"># Read chemical reference analysis data</span></span>
<span><span class="va">soilchem_tbl</span> <span class="op">&lt;-</span> <span class="fu">read_csv</span><span class="op">(</span>file <span class="op">=</span> <span class="st">"data/soilchem/soilchem_yamsys.csv"</span><span class="op">)</span></span>
<span><span class="co">#&gt; Parsed with column specification:</span></span>
<span><span class="co">#&gt; cols(</span></span>
<span><span class="co">#&gt; .default = col_double(),</span></span>
<span><span class="co">#&gt; sample_ID = col_character(),</span></span>
<span><span class="co">#&gt; country = col_character(),</span></span>
<span><span class="co">#&gt; site = col_character(),</span></span>
<span><span class="co">#&gt; material = col_character(),</span></span>
<span><span class="co">#&gt; site_comb = col_character()</span></span>
<span><span class="co">#&gt; )</span></span>
<span><span class="co">#&gt; See spec(...) for full column specifications.</span></span>
<span></span>
<span><span class="co"># Join spectra tibble and chemical reference analysis tibble</span></span>
<span><span class="va">spec_chem</span> <span class="op">&lt;-</span> <span class="fu"><a href="reference/join_spc_chem.html">join_spc_chem</a></span><span class="op">(</span></span>
<span> spc_tbl <span class="op">=</span> <span class="va">soilspec_tbl</span>, chem_tbl <span class="op">=</span> <span class="va">soilchem_tbl</span>, by <span class="op">=</span> <span class="st">"sample_id"</span><span class="op">)</span></span>
<span><span class="co">#&gt; Joining, by = "sample_id"</span></span>
<span></span>
<span><span class="co">################################################################################</span></span>
<span><span class="co">## Part 2: Run PLS regression models for different soil variables</span></span>
<span><span class="co">################################################################################</span></span>
<span></span>
<span><span class="co"># Example Partial Least Squares (PLS) Regression model for total Carbon (C)</span></span>
<span><span class="co"># Use repeated k-fold cross-validation to tune the model (choose optimal </span></span>
<span><span class="co"># number of PLS components) and estimate model performance on hold-out </span></span>
<span><span class="co"># predictions of the finally chosen model (model assessment).</span></span>
<span><span class="co"># This allows using the entire set for both model building and evaluation;</span></span>
<span><span class="co"># recommended for small data sets</span></span>
<span><span class="va">pls_C</span> <span class="op">&lt;-</span> <span class="fu"><a href="reference/fit_pls.html">fit_pls</a></span><span class="op">(</span></span>
<span> <span class="co"># remove rows with NA in the data</span></span>
<span> spec_chem <span class="op">=</span> <span class="va">spec_chem</span><span class="op">[</span><span class="op">!</span><span class="fu"><a href="https://rdrr.io/r/base/NA.html" class="external-link">is.na</a></span><span class="op">(</span><span class="va">spec_chem</span><span class="op">$</span><span class="va">C</span><span class="op">)</span>, <span class="op">]</span>,</span>
<span> response <span class="op">=</span> <span class="va">C</span>,</span>
<span> evaluation_method <span class="op">=</span> <span class="st">"resampling"</span>,</span>
<span> tuning_method <span class="op">=</span> <span class="st">"resampling"</span>,</span>
<span> resampling_method <span class="op">=</span> <span class="st">"rep_kfold_cv"</span>,</span>
<span> pls_ncomp_max <span class="op">=</span> <span class="fl">7</span> <span class="co"># maximum number of PLS components tested during tuning</span></span>
<span><span class="op">)</span> </span></code></pre></div>
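<p>For readers who want to see what tuning a PLS model with repeated k-fold cross-validation looks like when written directly with <code>caret</code>, here is a minimal sketch on simulated data. It is not the internal code of <code>fit_pls()</code>; the simulated predictors, the 5 folds, and the 3 repeats are assumptions, and the <code>pls</code> package needs to be installed for <code>method = "pls"</code>.</p>
<div class="sourceCode"><pre class="sourceCode r"><code class="sourceCode R">library(caret)

# Simulated stand-in for pre-processed spectra (rows = samples, columns = variables)
set.seed(42)
X &lt;- matrix(rnorm(60 * 20), nrow = 60,
  dimnames = list(NULL, paste0("wn_", 1:20)))
# Response depends on the first three variables plus a little noise
y &lt;- drop(X[, 1:3] %*% c(1, -0.5, 0.25)) + rnorm(60, sd = 0.1)

# 5-fold cross-validation repeated 3 times; keep the hold-out predictions
# of the finally chosen model
ctrl &lt;- trainControl(method = "repeatedcv", number = 5, repeats = 3,
  savePredictions = "final")

# Test 1 to 7 PLS components and select the number with the lowest RMSE
pls_fit &lt;- train(x = X, y = y, method = "pls",
  tuneGrid = expand.grid(ncomp = 1:7), trControl = ctrl, metric = "RMSE")

pls_fit$bestTune   # selected number of PLS components
head(pls_fit$pred) # hold-out predictions (pred) and observed values (obs)</code></pre></div>
<p>The hold-out predictions of the selected model can then be compared with the observed values to estimate model performance without setting aside a separate validation set.</p>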
</div>
<div class="section level2">
<h2 id="projects-using-simplerspec">Projects using simplerspec<a class="anchor" aria-label="anchor" href="#projects-using-simplerspec"></a>
</h2>
<ul>
<li><a href="https://sae-interactive-data.ethz.ch/simplerspec.drc/" class="external-link">Spectral platform for soil samples of the Democratic Republic of Congo</a></li>
</ul>
</div>
<div class="section level2">
<h2 id="package-help">Package help<a class="anchor" aria-label="anchor" href="#package-help"></a>
</h2>
<p>After successfully installing simplerspec, you can access R's built-in help with <code>?simplerspec::&lt;fun_name&gt;</code>.</p>
</div>
<div class="section level2">
<h2 id="like-it">Like it?<a class="anchor" aria-label="anchor" href="#like-it"></a>
</h2>
<p><a href="https://www.buymeacoffee.com/specphil" class="external-link"><img src="https://www.buymeacoffee.com/assets/img/custom_images/orange_img.png" alt="“Buy Me A Coffee”"></a></p>
</div>
<div class="section level2">
<h2 id="credits">Credits<a class="anchor" aria-label="anchor" href="#credits"></a>
</h2>
<p>I would like to thank the following people for the inspiration provided by their concepts, code, and packages:</p>
<ul>
<li>Antoine Stevens and Leonardo Ramirez-Lopez for their contributions to the <a href="https://l-ramirez-lopez.r-universe.dev/prospectr" class="external-link">prospectr package</a> and the <em>Guide to Diffuse Reflectance Spectroscopy &amp; Multivariate Calibration</em>
</li>
<li>Andrew Sila, Tomislav Hengl, and Thomas Terhoeven-Urselmans for the <a href="https://github.com/cran/soil.spec/blob/master/R/read.opus.R" class="external-link"><code>read.opus()</code></a> function from the <a href="https://cran.r-project.org/src/contrib/Archive/soil.spec/" class="external-link">soil.spec</a> package developed at ICRAF.</li>
<li>
<a href="https://hadley.nz" class="external-link">Hadley Wickham</a> for his work and concepts on data science within R</li>
<li>
<a href="https://github.com/topepo" class="external-link">Max Kuhn</a> for the creation of the caret package and for his excellent teaching materials on <a href="https://link.springer.com/book/10.1007/978-1-4614-6849-3" class="external-link">applied predictive modeling</a>
</li>
</ul>
</div>
</div>
</div>
<div class="col-md-3 hidden-xs hidden-sm" id="pkgdown-sidebar">
<div class="links">
<h2 data-toc-skip>Links</h2>
<ul class="list-unstyled">
<li><a href="https://github.com/philipp-baumann/simplerspec/" class="external-link">Browse source code</a></li>
<li><a href="https://github.com/philipp-baumann/simplerspec" class="external-link">Report a bug</a></li>
</ul>
</div>
<div class="license">
<h2 data-toc-skip>License</h2>
<ul class="list-unstyled">
<li><a href="LICENSE.html">Full license</a></li>
<li><small><a href="https://www.r-project.org/Licenses/GPL-2" class="external-link">GPL-2</a></small></li>
</ul>
</div>
<div class="citation">
<h2 data-toc-skip>Citation</h2>
<ul class="list-unstyled">
<li><a href="authors.html#citation">Citing simplerspec</a></li>
</ul>
</div>
<div class="developers">
<h2 data-toc-skip>Developers</h2>
<ul class="list-unstyled">
<li>Philipp Baumann <br><small class="roles"> Author, maintainer </small> </li>
</ul>
</div>
<div class="dev-status">
<h2 data-toc-skip>Dev status</h2>
<ul class="list-unstyled">
<li><a href="https://zenodo.org/badge/latestdoi/67121732" class="external-link"><img src="https://zenodo.org/badge/67121732.svg" alt="DOI"></a></li>
<li><a href="https://opensource.org/licenses/MIT" class="external-link"><img src="https://img.shields.io/badge/License-MIT-yellow.svg" alt="License: MIT"></a></li>
<li><a href="https://philipp-baumann.r-universe.dev/simplerspec" class="external-link"><img src="https://philipp-baumann.r-universe.dev/badges/simplerspec?scale=1&amp;color=pink&amp;style=round" alt="runiverse-package simplerspec"></a></li>
</ul>
</div>
</div>
</div>
<footer><div class="copyright">
<p></p>
<p>Developed by Philipp Baumann.</p>
</div>
<div class="pkgdown">
<p></p>
<p>Site built with <a href="https://pkgdown.r-lib.org/" class="external-link">pkgdown</a> 2.0.7.</p>
</div>
</footer>
</div>
</body>
</html>