PyIRoGlass Documentation

Data Imports

class PyIRoGlass.SampleDataLoader(spectrum_path=None, chemistry_thickness_path=None)[source]

A class for efficiently loading and managing spectral data and associated metadata from CSV files. This includes spectral data from multiple samples, as well as related glass chemistry and thickness information. The class supports loading spectral data within a specified wavenumber range from a directory containing CSV files and loading chemistry and thickness data from a CSV file.

spectrum_path

The default list of paths where the spectral data CSV files are located.

Type:: list[str]

chemistry_thickness_path

The default file path to the CSV file containing glass chemistry and thickness data.

Type:: str

load_spectrum_directory(paths, wn_high, wn_low)[source]: Loads spectral data from CSV files within a specified wavenumber range, ensuring that the wavenumbers are in ascending order and skipping headers if present.

load_chemistry_thickness(chemistry_thickness_path)[source]: Loads glass chemistry and thickness data from a CSV file, setting the “Sample” column as the index.

load_all_data(paths, chemistry_thickness_path, wn_high, wn_low)[source]: Loads both spectral data from CSV files and chemistry thickness data from a CSV file.

class PyIRoGlass.VectorLoader[source]

A class for loading spectral data vectors, including principal components and wavenumbers, from specified NPZ files. It preloads vectors from default NPZ files upon instantiation.

base_path

The base directory path where NPZ files are located.

Type:: str

Baseline_PC_path

The path to the NPZ file containing the baseline principal components.

Type:: str

H2Om_PC_path

The path to the NPZ file containing the H2Om principal components.

Type:: str

wn_high

The upper limit of the wavenumber range to be considered.

Type:: int

wn_low

The lower limit of the wavenumber range to be considered.

Type:: int

Wavenumber

The array of wavenumbers loaded from the baseline NPZ file.

Type:: np.ndarray

Baseline_PC

The matrix of baseline principal components.

Type:: np.ndarray

H2Om_PC

The matrix of H2O-modified principal components.

Type:: np.ndarray

load_PC(file_name)[source]: Loads baseline principal components from NPZ.

load_wavenumber(file_name)[source]: Loads wavenumbers from NPZ.

Building-blocks functions for fitting baselines and peaks

PyIRoGlass.gauss(x, mu, sd, A=1)[source]

Return a Gaussian fit for the CO_3^{2-} doublet peaks at 1515 and 1430 cm^-1 peaks.

Parameters:

x (numeric) – The wavenumbers of interest.
mu (numeric) – The center of the peak.
sd (numeric) – The standard deviation (or width of peak).
A (numeric, optional) – The amplitude. Defaults to 1.

Returns:

The Gaussian fit.

Return type:

G (np.ndarray)

PyIRoGlass.linear(x, m, b)[source]

Calculate a linear offset for adjusting model data.

Parameters:

x (np.ndarray) – Input values.
m (float) – Tilt of the linear offset.
b (float) – Offset of the linear offset.

Returns:

Linear offset.

Return type:

tilt_offset (np.ndarray)

PyIRoGlass.carbonate(P, x, PCmatrix, H2Om1635_PCmatrix, Nvectors=5)[source]

The carbonate function takes in the inputs of fitting parameters P, wavenumbers x, principal component matrix, number of principal component vectors of interest. The function calculates the H2Om_{1635} peak, CO_3^{2-} Gaussian peaks, linear offset, and baseline. The function then returns the model data.

Parameters:

P (np.ndarray) – Fitting parameters including principal component and peak weights, Gaussian parameters, linear offset slope and intercept.
x (np.ndarray) – Wavenumbers of interest.
PCmatrix (matrix) – Principal components matrix.
H2Om1635_PCmatrix (matrix) – Principal components matrix for the 1635 peak.
Nvectors (int, optional) – Number of principal components vectors of interest. Default is 5.

Returns:

Model data for the carbonate spectra.

Return type:

model_data (np.ndarray)

PyIRoGlass.als_baseline(intensities, asymmetry_param=0.05, smoothness_param=500000.0, max_iters=15, conv_thresh=1e-05, verbose=False)[source]

Computes the asymmetric least squares baseline. http://www.science.uva.nl/~hboelens/publications/draftpub/Eilers_2005.pdf

Parameters:

intensities (array-like) – Input signal to baseline.
asymmetry_param (float, optional) – Asymmetry parameter (p) that controls the weights. Defaults to 0.05.
smoothness_param (float, optional) – Smoothness parameter (s) that controls the smoothness of the baseline.
max_iters (int, optional) – Maximum number of iterations.
conv_thresh (float, optional) – Convergence threshold. Iteration stops when the change in the weights < threshold.
verbose (bool, optional) – If True, prints the iteration number and convergence at each iteration.

Returns:

Baseline of the input signal.

Return type:

z (np.ndarray)

class PyIRoGlass.WhittakerSmoother(signal, smoothness_param, deriv_order=1)[source]

Implements the Whittaker smoother for smoothing a signal. http://www.science.uva.nl/~hboelens/publications/draftpub/Eilers_2005.pdf

Parameters:

signal (array-like) – Input signal to be smoothed.
smoothness_param (float) – Relative importance of smoothness of the predicted response.
deriv_order (int, optional) – Order of the derivative of the identity matrix.

y

Input signal to be smoothed.

Type:: array-like

upper_bands

Upper triangular bands of the matrix used for smoothing.

Type:: array-like

PyIRoGlass.NIR_process(data, wn_low, wn_high, peak)[source]

The NIR_process function processes Near-IR absorbance data and returns relevant information, such as the fit absorbance data, kriged data, peak height, and signal to noise ratio. The function first filters the data, then fits a baseline and subtracts it to determine the peak absorbance. Next, the data are kriged to further reduce noise and obtain peak height. Finally, the signal to noise ratio is calculated, and a warning is issued if the ratio is high. The function is used three times with different H2Om wavenumber ranges for uncertainty assessment.

Parameters:

data (pd.DataFrame) – A DataFrame of absorbance data.
wn_low (int) – The lower bound wavenumber for NIR H2Om or OH.
wn_high (int) – The higher bound wavenumber for NIR H2Om or OH.
peak (str) – The H2Om or OH peak of interest.

Returns:

A DataFrame of the absorbance data in the region of interest, median filtered data, baseline subtracted absorbance, and the subtracted peak. krige_out (pd.DataFrame): A DataFrame of kriged data output. PH_krige (float): The peak height obtained after kriging. STN (float): The signal to noise ratio.

Return type:

peak_fit (pd.DataFrame)

PyIRoGlass.MIR_process(data, wn_low, wn_high)[source]

The MIR_process function processes the Mid-IR H2Ot, 3550 peak and returns the fit absorbance data, kriged data, peak height. The function first filters the data, then fits a baseline and subtracts it to determine the peak absorbance. Next, the data are kriged to further reduce noise and obtain peak height. The function is used three times with different H2Ot wavenumber ranges for uncertainty assessment.

Parameters:

data (pd.DataFrame) – A DataFrame of absorbance data.
wn_low (int) – The lower bound wavenumber for MIR H2Ot, 3550.
wn_high (int) – The higher bound wavenumber for MIR H2Ot, 3550.

Returns:

A DataFrame of absorbance data, median filtered data, baseline subtracted absorbance, and the subtracted peak. krige_out (pd.DataFrame): A DataFrame of kriged data output. PH_krige (float): The peak height obtained after kriging.

Return type:

data_output (pd.DataFrame)

PyIRoGlass.MCMC(data, uncert, indparams, log, savefile)[source]

Runs Monte Carlo-Markov Chain and outputs the best fit parameters and standard deviations.

Parameters:

data (np.ndarray) – 1D array of data points.
uncert (np.ndarray) – 1D array of uncertainties on the data.
indparams (np.ndarray) – 2D array of independent parameters. Each row corresponds to one data point and each column corresponds to a different independent parameter.
log (bool, optional) – Flag indicating whether to log the output or not. Defaults to False.
savefile (str, optional) – File name to save output to. If None, output is not saved. Defaults to None.

Returns:

The output of the Monte Carlo-Markov Chain.

Return type:

mc3_output (mc3.output)

PyIRoGlass.calculate_baselines(dfs_dict, export_path)[source]

The calculate_baselines function processes a collection of spectral data to fit baselines to all H2O and CO2 related peaks in basaltic-andesitic spectra. The function inputs the dictionary of DataFrames that were created by the SampleDataLoader class and determines best-fit (and standard deviations) baselines, peak heights, peak locations, peak widths, and principal component vectors used to fit the spectra. These values are exported in a csv file and figures are created for each individual sample. Optionally, it generates visual representations of the spectra and analysis outcomes, alongside saving detailed analysis results in various file formats for further review.

Parameters:

dfs_dict (dict) – A dictionary where keys are file identifiers and values are DataFrames containing the spectral data for each sample. The spectral data is expected to have columns for wavenumbers and absorbance values.
export_path (str, None) – The directory path where the output files (CSVs, figures, logs, etc.) should be saved. If None, no files will be saved.

Returns:

A DataFrame of absorbance data,: median filtered data, baseline subtracted absorbance, and the subtracted peak.
failures (list): A list of file identifiers for which the analysis: failed, possibly due to data issues or processing errors.

Return type:

data_output (pd.DataFrame)

Functions for calculating concentrations

PyIRoGlass.beer_lambert(molar_mass, absorbance, density, thickness, epsilon)[source]

Applies the Beer-Lambert Law to calculate concentration from given inputs.

Parameters:

molar_mass (float) – The molar mass of the substance in grams per mole.
absorbance (float) – The absorbance of the substance, measured in optical density units.
density (float) – The density of the substance in grams per cubic centimeter.
thickness (float) – The thickness of the sample in centimeters.
epsilon (float) – The molar extinction coefficient, measured in liters per mole per centimeter.

Returns:

pandas series containing the concentration in wt.%.

Return type:

pd.Series

Notes

The Beer-Lambert Law states that the absorbance of a substance is proportional to its concentration. The formula for calculating concentration from absorbance is: concentration = (1e6*molar_mass*absorbance)/(density*thickness*epsilon)

# https://sites.fas.harvard.edu/~scphys/nsta/error_propagation.pdf

PyIRoGlass.beer_lambert_error(N, molar_mass, absorbance, sigma_absorbance, density, sigma_density, thickness, sigma_thickness, epsilon, sigma_epsilon)[source]

Applies a Monte Carlo simulation to estimate the uncertainty in concentration calculated using the Beer-Lambert Law.

Parameters:

N (int) – The number of Monte Carlo samples to generate.
molar_mass (float) – The molar mass of the substance in grams per mole.
absorbance (float) – The absorbance of the substance, measured in optical density units.
sigma_absorbance (float) – The uncertainty associated with the absorbance measurement.
density (float) – The density of the substance in grams per cubic centimeter.
sigma_density (float) – The uncertainty associated with the density measurement.
thickness (float) – The thickness of the sample in centimeters.
sigma_thickness (float) – The uncertainty associated with the sample thickness measurement.
epsilon (float) – The molar extinction coefficient, measured in liters per mole per centimeter.
sigma_epsilon (float) – The uncertainty associated with the molar extinction coefficient measurement.

Returns:

The estimated uncertainty in concentration in wt.%.

Return type:

float

Notes

The Beer-Lambert Law states that the absorbance of a substance is proportional to its concentration. The formula for calculating concentration from absorbance is: concentration = (1e6*molar_mass*absorbance)/(density*thickness*epsilon) This function estimates the uncertainty in concentration using a Monte Carlo simulation. Absorbance, density, and thickness are assumed to follow Gaussian distributions with means from the input values and standard deviations from the input uncertainties. The absorbance coefficient is assumed to follow a uniform distribution with a mean given by the input value and a range from the input uncertainty. The function returns the standard deviation of the resulting concentration distribution.

# https://astrofrog.github.io/py4sci/_static/Practice%20Problem%20-%20Monte-Carlo%20Error%20Propagation%20-%20Sample%20Solution.html

PyIRoGlass.calculate_concentrations(Volatile_PH, composition, thickness, export_path, N=500000, T=25, P=1)[source]

The calculate_concentrations function calculates the concentrations and uncertainties of volatile components (H2O peak (3550 cm^-1), molecular H2O peak (1635 cm^-1), and carbonate peaks (1515 and 1430 cm^-1) in a glass sample based on peak height data, sample composition, and wafer thickness. This function uses the Beer-Lambert law for absorbance to estimate concentrations and applies Monte Carlo simulations to quantify uncertainties. It iteratively adjusts for the effect of water content on glass density to improve accuracy.

Parameters:

Volatile_PH (pd.DataFrame) – DataFrame with columns for peak heights of total H2O (3550 cm^-1), molecular H2O (1635 cm^-1), and carbonate peaks (1515 and 1430 cm^-1). Each row represents a different sample.
composition (dictionary) – Dictionary with keys as oxide names and values as their weight percentages in the glass composition.
thickness (pd.DataFrame) – DataFrame with “Thickness” column indicating wafer thickness in micrometers (µm) for each sample.
N (int) – Number of Monte Carlo simulations to perform for uncertainty estimation. Default is 500,000.
T (int) – Temperature in Celsius at which the density is calculated. Default is 25°C.
P (int) – Pressure in bars at which the density is calculated. Default is 1 bar.

Returns:

DataFrame containing calculated: volatile concentrations and their uncertainties for each sample, including columns for mean and standard deviation of H2O and CO2 species concentrations. ALso contains density (“Density” column) and extinction coefficient (“epsilon” column) for each sample, providing insight into the properties of the glass under analysis.

Return type:

concentrations_df (pd.DataFrame)

Note

The function assumes that the input composition includes all relevant oxides and that the Volatile_PH DataFrame contains peak height data for all specified peaks. Errors in input data or missing values may affect the accuracy of the results.

Functions for calculating density, molar absorptivity

PyIRoGlass.calculate_density(composition, T, P, model='LS')[source]

The calculate_density function inputs the MI composition file and outputs the glass density at the temperature and pressure of analysis. The mole fraction is calculated. The total molar volume xivibari is determined from sum of the mole fractions of each oxide * partial molar volume at room temperature and pressure of analysis. The gram formula weight gfw is then determined by summing the mole fractions*molar masses. The density is finally determined by dividing gram formula weight by total molar volume.

Parameters:

composition (pd.DataFrame) – Dataframe containing oxide weight percentages for the glass composition.
T (float) – temperature at which the density is calculated (in Celsius)
P (float) – pressure at which the density is calculated (in bars)
model (str) – Choice of density model. “LS” for Lesher and Spera (2015), “IT” for Iacovino and Till (2019). Default is “LS”.

Returns:

DataFrame containing the oxide mole fraction for: the glass composition
density (float): glass density at room temperature and pressure: (in kg/m^3)

Return type:

mol (pd.DataFrame)

PyIRoGlass.calculate_epsilon(composition, T, P)[source]

The calculate_epsilon function computes the extinction coefficients and their uncertainties for various molecular species in a given MI or glass composition dataset.

Parameters:

composition (dictionary) – Dictionary containing the weight percentages of each oxide in the glass composition
T (int) – temperature at which the density is calculated (in Celsius)
P (int) – pressure at which the density is calculated (in bars)

Returns:

Dataframe of the mole fraction of each oxide in: the glass composition
density (pd.DataFrame): Dataframe of glass density at room temperature: and pressure (in kg/m^3)

Return type:

mol (pd.DataFrame)

Functions for plotting MCMC results

PyIRoGlass.plot_H2Om_OH(data, files, als_bls, ax_top=None, ax_bot=None)[source]

Visualizes NIR spectrum data along with baseline-subtracted and kriged peak fits for H2Om and OH components in spectral data. The function plots the original NIR spectrum and overlays the median-filtered peak fits and baseline fits for H2Om (5200 cm^-1) and OH- (4500 cm^-1) peaks on the top axis. The bottom axis displays the baseline-subtracted peaks and the kriged peak fits, emphasizing the isolated peak shapes and their respective signal-to-noise ratios.

Parameters:

data (pd.DataFrame) – DataFrame containing spectral data with “Absorbance” values indexed by wavenumber.
files (str) – The identifier for the current sample set being processed, used for titling the plot.
als_bls (dict) – A dictionary containing results from baseline fitting and peak analysis. Expected keys include “H2Om_5200_results” and “OH_4500_results”, each storing a list of result dictionaries from the ALS (Asymmetric Least Squares) processing for respective peaks.
ax_top (matplotlib.axes.Axes, optional) – The top subplot axis for plotting the NIR spectrum and peak fits. If not provided, a new figure with subplots will be created.
ax_bot (matplotlib.axes.Axes, optional) – The bottom subplot axis for plotting baseline-subtracted peaks and kriged peak fits. If not provided, it will be created along with ax_top in a new figure.

Returns:

This function does not return any value. It generates a plot visualizing the spectral data, baseline fits, and peak fits for the provided sample.

Return type:

None

Note

The function is designed to work within a larger analytical framework, where spectral preprocessing and peak fitting have been previously conducted. It expects specific data structures from these analyses as input. The function modifies the provided ax_top and ax_bot axes in place if they are provided; otherwise, it creates a new figure and axes for plotting.

PyIRoGlass.plot_H2Ot_3550(data, files, als_bls, ax=None)[source]

Plots Mid Infrared (MIR) spectral data along with the baseline and filtered peak for the total water (H2Ot) peak at 3550 cm^-1. This function visualizes the original MIR spectrum, the baseline fit, and the baseline-subtracted and filtered peak for H2Ot, highlighting the isolated peak shape and its characteristics.

Parameters:

data (pd.DataFrame) – DataFrame containing spectral data with “Absorbance” values indexed by wavenumber. Expected to cover the MIR range relevant for the H2Ot peak analysis.
files (str) – Identifier for the current sample set being processed, used for titling the plot.
als_bls (dict) – Dictionary containing results from baseline fitting and peak analysis. Expected to have a key “H2Ot_3550_results” which stores a list of dictionaries with results from the processing for the H2Ot peak.
ax (matplotlib.axes.Axes, optional) – Matplotlib axis object where the plot will be drawn. If None, a new figure and axis will be created.

Returns:

This function does not return any value. It generates a plot: visualizing the MIR spectral data, baseline fit, and filtered peak fit for the H2Ot peak at 3550 cm^-1.

Return type:

None

Note

The function is designed to work as part of a larger spectroscopic analysis workflow, where spectral data preprocessing, baseline fitting, and peak analysis have been previously conducted. It expects specific data structures from these analyses as input. If “ax” is not provided, the function creates a new figure and axis for plotting, which might not be ideal for integrating this plot into multi-panel figures or more complex visual layouts.

PyIRoGlass.derive_carbonate(data, files, mc3_output, export_path)[source]

Derives and saves carbonate region baseline and peak fits from FTIR spectral data using the PyIRoGlass model outputs. This function interpolates the spectral data to match the wavenumber spacing required by the model, calculates the baseline and peak fits for carbonate at 1430 and 1515 cm^-1, and water at 1635 cm^-1. It also generates an ensemble of baseline fits from the posterior distributions to estimate uncertainties. The results, including best fits and baselines, can optionally be saved to CSV files.

Parameters:

data (pd.DataFrame) – DataFrame containing FTIR spectral data with “Absorbance” values indexed by wavenumber.
file (str) – Spectrum sample name.
mc3_output (dict) – Dictionary containing results from PyIRoGlass fitting, including best fit parameters, posterior distributions, and other model outputs relevant for calculating peak fits.
savefile (bool) – If True, the function will save the best fits and baselines data to CSV files named according to the “file” parameter. If False, no files will be saved.

Returns:

DataFrame containing the original spectral: data, the complete spectrum fit from PyIRoGlass, individual peak fits for carbonate and water, and the calculated baseline. Indexed by wavenumber.
baselines (pd.DataFrame): DataFrame containing an ensemble of: baseline fits derived from the posterior distribution to represent uncertainty in the baseline estimation. Indexed by wavenumber.

Return type:

bestfits (pd.DataFrame)

PyIRoGlass.plot_carbonate(data, files, mc3_output, export_path, ax=None)[source]

Plots the FTIR spectrum along with baseline fits, and model fits for carbonate and water peaks. This function visualizes the original FTIR spectrum, the baseline determined by PCA, and model fits for CO_3^{2-} doublet, and the H2Om_1635 peak, generated from PyIRoGlass fitting.

Parameters:

data (pd.DataFrame) – DataFrame containing FTIR spectral data with “Absorbance” values indexed by wavenumber. Expected to cover the range relevant for carbonate and water peak analysis.
files (str) – Identifier for the current sample set being processed, used for titling the plot.
mc3_output (dict) – Dictionary containing results from PyIRoGlass fitting, including best fit parameters, standard deviations, posterior distributions, and other model outputs.
ax (matplotlib.axes.Axes, optional) – Matplotlib axis object where the plot will be drawn. If None, a new figure and axis will be created.

Returns:

This function does not return any value. It generates a plot: visualizing the FTIR spectral data, baseline fit, and model fits for carbonate and water peaks.

Return type:

None

Note

The function is designed to work as part of a larger spectroscopic analysis workflow, where spectral data preprocessing, baseline fitting, and peak modeling have been previously conducted. It expects specific data structures from these analyses as input. If “ax” is not provided, the function creates a new figure and axis for plotting, which might not be ideal for integrating this plot into multi-panel figures or more complex visual layouts.

PyIRoGlass.plot_modelfit(data, uncert, indparams, model, title, nbins=75, fignum=1400, savefile=None)[source]

Plot the binned dataset with given uncertainties and model curves as a function of indparams. In a lower panel, plot the residuals between the data and model.

Parameters:

data (1D np.ndarray) – Input data set.
uncert (1D np.ndarray) – One-sigma uncertainties of the data points.
indparams (1D np.ndarray) – Independent variable (X axis) of the data points.
model (1D np.ndarray) – Model of data.
nbins (int) – Number of bins in the output plot.
fignum (int) – Figure number.
savefile (bool) – Name of file to save the plot, if not none.

Returns:

Axes instance containing the marginal: posterior distributions.

Return type:

ax (matplotlib.axes.Axes)

PyIRoGlass.plot_trace(posterior, title, zchain=None, pnames=None, thinning=50, burnin=0, fignum=1000, savefile=None, fmt='.', ms=2.5, fs=12)[source]

Plot parameter trace MCMC sampling.

Parameters:

posterior (2D np.ndarray) – MCMC posterior sampling with dimension: [nsamples, npars].
zchain (1D np.ndarray) – Chain index for each posterior sample.
pnames (str) – Label names for parameters.
thinning (int) – Thinning factor for plotting (plot every thinning-th value).
burnin (int) – Thinned burn-in number of iteration (only used when zchain is not None).
fignum (int) – The figure number.
savefile (bool) – Name of file to save the plot if not none
fmt (string) – The format string for the line and marker.
ms (float) – Marker size.
fs (float) – Fontsize of texts.

Returns:

List of axes containing: the marginal posterior distributions.

Return type:

axes (1D list of matplotlib.axes.Axes)

Functions for determining thickness from reflectance FTIR spectra

PyIRoGlass.datacheck_peakdetect(x_axis, y_axis)[source]

Check and prepare data for peak detection analysis.

This function ensures that the input data for peak detection is in the correct format and that the x- and y-axis data are of equal lengths. If the x-axis data is not provided, it generates an x-axis as a range of integers with the same length as the y-axis data.

Parameters:

x_axis (np.ndarray or None) – A 1D array containing the x-axis data or None if an x-axis is to be generated.
y_axis (np.ndarray) – A 1D array containing the y-axis data.

Returns:

x_axis (np.ndarray): The checked or generated x-axis data. y_axis (np.ndarray): The checked y-axis data.

Return type:

Tuple containing two numpy arrays

Raises:

ValueError – If the lengths of the x-axis and y-axis data do not match.

Notes

This function comes from https://github.com/avhn/peakdetect. I pulled this for peak fitting, as the repository is no longer maintained and installation no longer works.

PyIRoGlass.peakdetect(y_axis, x_axis=None, lookahead=200, delta=0)[source]

Detect local maxima and minima in a signal based on a MATLAB script (http://billauer.co.il/peakdet.html).

This function identifies peaks by searching for values which are surrounded by lower or higher values for maxima and minima, respectively.

Parameters:

y_axis (list or np.ndarray) – A list or 1D numpy array containing the signal over which to find peaks.
x_axis (list or np.ndarray, optional) – A list or 1D numpy array whose values correspond to the y_axis list and is used in the return to specify the position of the peaks. If omitted, an index of the y_axis is used. Defaults to None.
lookahead (int, optional) – Distance to look ahead from a peak candidate to determine if it is the actual peak. Defaults to 200. A good value might be ‘(samples / period) / f’ where ‘4 >= f >= 1.25’.
delta (float, optional) – Specifies a minimum difference between a peak and the following points, before a peak may be considered a peak. Useful to hinder the function from picking up false peaks towards the end of the signal. To work well, delta should be set to delta >= RMSnoise x 5. Defaults to 0. When omitted, it can decrease the speed by 20%, but when used correctly, it can double the speed of the function.

Returns:

A tuple of two lists ([max_peaks, min_peaks]) containing the positive and negative peaks, respectively. Each element of the lists is a tuple of (position, peak_value). To get the average peak value, use: np.mean(max_peaks, 0)[1]. To unpack one of the lists into x, y coordinates, use: x, y = zip(max_peaks).

Notes

This function comes from https://github.com/avhn/peakdetect. I pulled this for peak fitting, as the repository is no longer maintained and installation no longer works.

PyIRoGlass.peakID(ref_spec, wn_high, wn_low, peak_heigh_min_delta, peak_search_width, savgol_filter_width, smoothing_wn_width=None, plotting=False, filename=None)[source]

Identifies peaks based on the peakdetect package which identifies local maxima and minima in noisy signals. Based on: https://github.com/avhn/peakdetect

Parameters:

ref_spec (pd.DataFrame) – A Pandas DataFrame indexed by wavenumber and containing absorbance values.
wn_high (int) – The upper wavenumber limit for the analysis.
wn_low (int) – The lower wavenumber limit for the analysis.
smoothing_wn_width (int) – The window size for the Savitzky-Golay smoothing filter. Default is None.
peak_heigh_min_delta (float) – Minimum difference between a peak and its neighboring points for it to be considered a peak. Default is 0.008.
peak_search_width (int) – The size of the region around each point to search for a peak. Default is 50.
plotting (bool) – Whether to create a plot of the spectrum with identified peaks and troughs. Default is False.
filename (str) – The name of the plot file. If None, the plot is not saved. Default is None.

Returns:

Peaks and troughs identified as local maxima and minima.

Return type:

Tuple containing the following elements

PyIRoGlass.calculate_thickness(n, positions)[source]

Calculates thicknesses of glass wafers based on the refractive index of the glass and the positions of the peaks or troughs in the FTIR spectrum.

Parameters:

n (float) – Refractive index of the glass.
positions (np.ndarray) – Array of positions of the peaks or troughs in the FTIR spectrum.

Returns:

Array of thicknesses of glass wafers.

Return type:

np.ndarray

PyIRoGlass.calculate_mean_thickness(dfs_dict, n, wn_high, wn_low, plotting=False, phaseol=True)[source]

Calculates thickness of glass wafers based on the refractive index of the glass and the positions of the peaks or troughs in the FTIR spectrum. Thicknesses for each interference fringe, starting at both the peaks and troughs of the fringes are determined. These thicknesses are then averaged over the interval of interest.

Parameters:

dfs_dict (dictionary) – dictionary containing FTIR data for each file
n (float) – refractive index of the glass
wn_high (float) – the high wavenumber cutoff for the analysis
wn_low (float) – the low wavenumber cutoff for the analysis
plotting (bool) – whether or not to plot the data and detected peaks and troughs

Returns:

a dataframe containing the thickness calculations for each file.

Return type:

ThickDF (pd.DataFrame)

Notes

smoothing_wn_width (float): Width of the Savitzky-Golay smoothing window, if not used, set to None. peak_heigh_min_delta (float): Minimum height difference between a peak and its surrounding points. peak_search_width (float): Distance (in wavenumbers) to look on either side of a peak to find the corresponding trough.

PyIRoGlass.reflectance_index(XFo)[source]

Calculates the reflectance index for a given forsterite composition. The reflectance index is calculated based on values from Deer, Howie, and Zussman, 3rd Edition.

Parameters:: XFo (float) – The mole fraction of forsterite in the sample.
Returns:: The calculated reflectance index.
Return type:: n (float)

Functions for molar absorptivity inversions

PyIRoGlass.inversion(x, y, sigma_x, sigma_y, intercept_zero=False)[source]

Perform a Newtonian inversion on a given set of x and y data.

Parameters:

x (np.ndarray) – A 1D array containing the x (composition) data.
y (np.ndarray) – A 1D array containing the y (absorbance coefficient) data.
sigma_x (float) – The standard deviation of the x (composition) data.
sigma_y (float) – The standard deviation of the y (absorbance coefficient) data.
intercept_zero (boolean) – Determines if the intercept is explicitly set to zero. Setting intercept_zero to True forces the intecept to zero. Setting intercept_zero to False allows the intercept to be normally estimated as part of the regression.

Returns:

mls (np.ndarray): A 1D array of the least squares estimate of: the coefficients.
mest_f (np.ndarray): A 1D array of the final estimate of the: coefficients.
covls (np.ndarray): The covariance matrix of the least squares: estimate.
covm_est_f (np.ndarray): The covariance matrix of the final: estimate.
covy_est_f (np.ndarray): The covariance matrix of the final: estimate of the absorbance coefficients.
x_pre (np.ndarray): A 1D array of the predicted x values based: on the final estimate.
y_pre (np.ndarray): A 1D array of the predicted absorbance: coefficient values based on the final estimate.
y_linear (np.ndarray): A 1D array of the predicted absorbance: coefficient values based on the linear regression estimate.

Return type:

Tuple containing the following elements

PyIRoGlass.least_squares(x, y, sigma_y)[source]

Perform a least squares regression on a given set of composition and absorbance coefficient data.

Parameters:

x (np.ndarray) – A 1D array containing the x (composition) data.
y (np.ndarray) – A 1D array containing the y (absorbance coefficient) data.
sigma_y (float) – The standard deviation of the absorbance coefficient data.

Returns:

mls (np.ndarray): A 1D array of the least squares estimate of the: coefficients.
covls (np.ndarray): The covariance matrix of the least squares: estimate.

Return type:

Tuple containing the following elements

PyIRoGlass.calculate_calibration_error(covariance_matrix)[source]

Calculate the calibration error based on the diagonal elements of a covariance matrix.

Parameters:: covariance_matrix (np.ndarray) – A covariance matrix.
Returns:: A float representing the calibration error.

PyIRoGlass.calculate_y_inversion(m, x)[source]

Calculate y values using coefficients and composition.

Parameters:

m (np.ndarray) – An array of coefficients.
x (np.ndarray) – An array of x (composition) data.

Returns:

A 1D array of calculated y values.

PyIRoGlass.calculate_SEE(residuals)[source]

Calculate the standard error of estimate given an array of residuals.

Parameters:: residuals (np.ndarray) – An array of residuals.
Returns:: A float representing the standard error of estimate.

PyIRoGlass.calculate_R2(true_y, pred_y)[source]

Calculate the coefficient of determination given actual and predicted values.

Parameters:

true_y (np.ndarray) – An array of actual values.
pred_y (np.ndarray) – An array of predicted values.

Returns:

A float representing the coefficient of determination.

PyIRoGlass.calculate_RMSE(residuals)[source]

Calculate the root mean squared error given an array of residuals.

Parameters:: residuals (np.ndarray) – An array of residuals.
Returns:: A float representing the root mean squared error.

PyIRoGlass.calculate_RRMSE(true_y, pred_y)[source]

Calculate the relative root mean squared error (RRMSE) between true and predicted values.

Parameters:

true_y (np.ndarray) – An array of true values.
pred_y (np.ndarray) – An array of predicted values.

Returns:

A float representing the relative root mean squared error.

PyIRoGlass.calculate_CCC(true_y, pred_y)[source]

Calculate the Concordance Correlation Coefficient (CCC) between true and predicted values.

Parameters:

true_y (np.ndarray) – An array of true values.
pred_y (np.ndarray) – An array of predicted values.

Returns:

A float representing the CCC.

PyIRoGlass.inversion_fit_errors(x, y, mest_f, covy_est_f)[source]

Calculate error metrics for a given set of data.

Parameters:

x (np.ndarray) – A 1D array containing the x (composition) data.
y – A 1D array containing the absorbance coefficient data.
mest_f (np.ndarray) – A 1D array of the final estimate of the coefficients.
covy_est_f (np.ndarray) – The covariance matrix of the final estimate of the absorbance coefficients.

Returns:

E_calib: A float representing the error in calibration. see_inv: A float representing the standard error of estimate. r2_inv: A float representing the coefficient of determination. rmse_inv: A float representing the root mean squared error.

Return type:

Tuple containing the following elements

PyIRoGlass.inversion_fit_errors_plotting(x, y, mest_f)[source]

Generate data for plotting the inversion fit along with its confidence and prediction intervals. This function calculates the fitted line using the inversion method, along with the corresponding confidence and prediction intervals for the regression. These intervals provide an estimation of the uncertainty associated with the regression fit and future predictions, respectively.

Parameters:

x (np.ndarray) – A 1D array containing the independent variable data.
y (np.ndarray) – A 1D array containing the dependent variable data.
mest_f (np.ndarray) – A 1D array containing the model parameters estimated by the inversion method.

Returns:

A 1D array of 100 linearly spaced values between: 0 and 1, representing the independent variable values for plotting the fitted line and intervals.
line_y (np.ndarray): The y values of the fitted line evaluated at: line_x, using the inversion method.
conf_lower (np.ndarray): The lower bound of the confidence interval for: the fitted line, evaluated at line_x.
conf_upper (np.ndarray): The upper bound of the confidence interval for: the fitted line, evaluated at line_x.
pred_lower (np.ndarray): The lower bound of the prediction interval for: the fitted line, evaluated at line_x.
pred_upper (np.ndarray): The upper bound of the prediction interval for: the fitted line, evaluated at line_x.

Return type:

line_x (np.ndarray)