This page was generated from docs/examples/transmission_ftir/PyIRoGlass_Transmission.ipynb.


Transmission FTIR Spectra

  • This Jupyter notebook provides an example workflow for processing transmission FTIR spectra through PyIRoGlass.

  • The Jupyter notebook and data can be accessed here: https://github.com/SarahShi/PyIRoGlass/blob/main/docs/examples/transmission_ftir/.

  • You need to install the PyIRoGlass PyPI package on your machine once. If you have not done this, please uncomment the line below (remove the # symbol) and run the cell.

[1]:
#!pip install PyIRoGlass

Load Python Packages and Data

Load Python Packages

[2]:
# Import packages

import PyIRoGlass as pig

from IPython.display import Image

%matplotlib inline
%config InlineBackend.figure_format = 'retina'

pig.__version__
[2]:
'0.6.6'

Set paths to data

Update the path to the directory containing the spectra, as well as the paths to the chemistry and thickness data.

[3]:
spectrum_path = 'SPECTRA/'
print(spectrum_path)

chemistry_thickness_path = 'ChemThick.csv'
print(chemistry_thickness_path)
SPECTRA/
ChemThick.csv
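Before loading, it can help to confirm that the spectrum directory exists and holds the files you expect. The sketch below is one way to do that with the standard library. The helper name list_spectrum_files is hypothetical (it is not part of PyIRoGlass), and it assumes the spectra are CSV files stored directly inside the directory. The demonstration uses a temporary directory with made-up file names.

```python
from pathlib import Path
import tempfile


def list_spectrum_files(spectrum_dir):
    """Return sorted sample names (file stems) of the CSV files in a directory.

    Hypothetical helper for illustration; not part of PyIRoGlass.
    """
    directory = Path(spectrum_dir)
    if not directory.is_dir():
        raise FileNotFoundError(f"Spectrum directory not found: {directory}")
    # Accept both .csv and .CSV extensions
    return sorted(
        p.stem
        for p in directory.iterdir()
        if p.is_file() and p.suffix.lower() == ".csv"
    )


# Demonstration on a temporary directory with made-up file names
with tempfile.TemporaryDirectory() as tmp:
    for name in ["sample_a.CSV", "sample_b.csv", "notes.txt"]:
        (Path(tmp) / name).write_text("placeholder")
    demo_names = list_spectrum_files(tmp)

print(demo_names)
```

With real data, the call would be list_spectrum_files(spectrum_path).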

Set desired output file directory name

Update the export_path to the desired output location.

[4]:
export_path = 'RESULTS'
print(export_path)
RESULTS

Load transmission FTIR spectra along with chemistry thickness data

We will use the class pig.SampleDataLoader to load all FTIR spectra and chemistry thickness data. The class takes the arguments:

  • spectrum_path: String or list path to the directory with spectral data

  • chemistry_thickness_path: String path to CSV file with glass chemistry and thickness data

and contains the methods:

  • load_spectrum_directory: Loads spectral data

  • load_chemistry_thickness: Loads chemistry thickness data

  • load_all_data: Loads spectral and chemistry thickness data

Here, we use load_all_data. This returns the outputs of:

  • dfs_dict: Dictionary where the keys are file identifiers and values are DataFrames with spectral data

  • chemistry: DataFrame of chemical data

  • thickness: DataFrame of thickness data

The file names of the spectra (everything before the .CSV extension) serve as the sample identifiers, and they are used to match each spectrum to its melt composition and thickness. Make sure that the ChemThick.csv file uses the same sample names as the loaded spectra.

[5]:
loader = pig.SampleDataLoader(spectrum_path=spectrum_path, chemistry_thickness_path=chemistry_thickness_path)
dfs_dict, chemistry, thickness = loader.load_all_data()

Let’s look at dfs_dict, the dictionary of transmission FTIR spectra. Samples are identified by their file names (keys), and the wavenumber and absorbance data for each spectrum are stored as a DataFrame (values).

[6]:
dfs_dict
[6]:
{'AC4_OL49_021920_30x30_H2O_a':             Absorbance
 Wavenumber
 1000.917      6.000000
 1002.845      6.000000
 1004.774      3.212358
 1006.702      6.000000
 1008.631      3.550053
 ...                ...
 5490.577      0.658218
 5492.505      0.657289
 5494.434      0.657169
 5496.362      0.658473
 5498.291      0.660256

 [2333 rows x 1 columns],
 'AC4_OL53_101220_256s_30x30_a':             Absorbance
 Wavenumber
 1000.916      6.000000
 1002.845      2.809911
 1004.774      2.584419
 1006.702      2.808356
 1008.631      3.712419
 ...                ...
 5490.576      0.118337
 5492.505      0.117460
 5494.433      0.117553
 5496.362      0.117506
 5498.291      0.116924

 [2333 rows x 1 columns],
 'STD_D1010_012821_256s_100x100_a':             Absorbance
 Wavenumber
 1000.916      3.844739
 1002.845      3.630789
 1004.774      6.000000
 1006.702      6.000000
 1008.631      6.000000
 ...                ...
 5490.576      0.394656
 5492.505      0.395436
 5494.433      0.396272
 5496.362      0.396495
 5498.291      0.396368

 [2333 rows x 1 columns]}
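Because each value in the dictionary is an ordinary pandas DataFrame indexed by wavenumber, standard pandas selection applies. The sketch below builds a toy dictionary with the same layout (synthetic numbers, hypothetical key toy_sample) and slices out a wavenumber window by label. The window limits are arbitrary and chosen only for illustration.

```python
import numpy as np
import pandas as pd

# Toy stand-in for dfs_dict: one synthetic spectrum with the same layout
# (index named 'Wavenumber', single 'Absorbance' column). Values are made up.
wavenumber = np.arange(1000.0, 5500.0, 2.0)
toy_spectrum = pd.DataFrame(
    {"Absorbance": np.linspace(1.0, 0.5, wavenumber.size)},
    index=pd.Index(wavenumber, name="Wavenumber"),
)
toy_dict = {"toy_sample": toy_spectrum}

# Pull one spectrum by its key, then slice a wavenumber window by label
spectrum = toy_dict["toy_sample"]
window = spectrum.loc[1250:2400]

print(window.index.min(), window.index.max(), len(window))
```

The same pattern works on the real dictionary, for example dfs_dict['AC4_OL49_021920_30x30_H2O_a'].loc[1250:2400].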

Display chemistry, the DataFrame of glass compositions.

[7]:
chemistry
[7]:
                                  SiO2   TiO2  Al2O3  Fe2O3    FeO    MnO    MgO    CaO   Na2O    K2O   P2O5
Sample
AC4_OL49_021920_30x30_H2O_a      52.34   1.04  17.92   1.93   7.03   0.20   3.63   7.72   4.25   0.78   0.14
AC4_OL53_101220_256s_30x30_a     47.95   1.00  18.88   2.04   7.45   0.19   4.34   9.84   3.47   0.67   0.11
STD_D1010_012821_256s_100x100_a  51.41   1.26  16.58   0.00   7.58   0.00   7.57  10.98   3.01   0.37   0.18

Display thickness, the DataFrame of wafer thicknesses.

[8]:
thickness
[8]:
                                 Thickness  Sigma_Thickness
Sample
AC4_OL49_021920_30x30_H2O_a          91.25                3
AC4_OL53_101220_256s_30x30_a         39.00                3
STD_D1010_012821_256s_100x100_a     231.00                3

Note that the sample names in dfs_dict, chemistry, and thickness all align.
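With many samples, checking the alignment by eye becomes error-prone. A minimal sketch of an automated check follows. The helper find_unmatched_samples is hypothetical (not part of PyIRoGlass) and accepts any three collections of names. The demonstration uses the sample names from this notebook and deliberately drops one from the thickness list.

```python
def find_unmatched_samples(spectrum_names, chemistry_names, thickness_names):
    """Return the names missing from each source, relative to the union of all three.

    Hypothetical helper for illustration; not part of PyIRoGlass.
    """
    groups = {
        "spectra": set(spectrum_names),
        "chemistry": set(chemistry_names),
        "thickness": set(thickness_names),
    }
    all_names = set().union(*groups.values())
    return {label: sorted(all_names - names) for label, names in groups.items()}


# Sample names from this notebook; one is left out of the thickness list on purpose
names = [
    "AC4_OL49_021920_30x30_H2O_a",
    "AC4_OL53_101220_256s_30x30_a",
    "STD_D1010_012821_256s_100x100_a",
]
missing = find_unmatched_samples(names, names, names[:2])
print(missing)
```

On the loaded data, the call would be find_unmatched_samples(dfs_dict.keys(), chemistry.index, thickness.index). Three empty lists mean every sample appears in all three sources.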

We’re ready to roll – MCMC, here we come!

We use the function pig.calculate_baselines, which takes in two arguments:

  • dfs_dict: Dictionary where the keys are file identifiers and values are DataFrames with spectral data

  • export_path: Desired output directory name, or None to prevent figure generation

and returns:

  • Volatile_PH: DataFrame with peak heights and their associated uncertainties

  • failures: List of file identifiers for which analysis failed

Running this code will take a few seconds to minutes per spectrum, as it fits up to \(\mathrm{10^6}\) baselines and peaks to each spectrum to characterize the uncertainty; sampling stops early once the chains converge. If any samples fail, they will be returned in the list failures. It took 20 seconds to process 1 spectrum on my M2 MacBook Pro with 12 CPU cores. The same task takes about 2 minutes on Google Colab.

The function automatically saves this DataFrame as a CSV, so the results are preserved on disk. We will also use this DataFrame to calculate concentrations.

[9]:
Volatile_PH, failures = pig.calculate_baselines(dfs_dict, export_path)

::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
  Multi-core Markov-chain Monte Carlo (mc3).
  Version 3.2.1.
  Copyright (c) 2015-2026 Patricio Cubillos and collaborators.
  mc3 is open-source software under the MIT license (see LICENSE).
::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::


::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
  Warning:
    The number of requested CPUs (4) is >= than the number of
available CPUs (2).  Enforced ncpu to 1.
::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::

Least-squares best-fitting parameters:
  [ 1.10352344e+00 -1.06147069e+00  9.35733623e-01 -9.21114850e-02
  3.00000000e-01  1.42788938e+03  2.88207408e+01  1.09418304e-01
  1.51745559e+03  3.58474238e+01  1.07059543e-01  6.59012777e-01
  1.22070957e-01  1.53688190e-02 -3.11486464e-04  1.23428191e+00]

Yippee Ki Yay Monte Carlo!
Start MCMC chains  (Sat Mar 21 08:51:59 2026)

[:         ]  10.0% completed  (Sat Mar 21 08:52:04 2026)
Out-of-bound Trials:
[    0     0     0     0 10442     0     0     3     0     0     7     0
     0     0     0     0]
Best Parameters: (chisq=369.0163)
[ 1.10352344e+00 -1.06147069e+00  9.35733623e-01 -9.21114850e-02
  3.00000000e-01  1.42788938e+03  2.88207408e+01  1.09418304e-01
  1.51745559e+03  3.58474238e+01  1.07059543e-01  6.59012777e-01
  1.22070957e-01  1.53688190e-02 -3.11486464e-04  1.23428191e+00]

[::        ]  20.0% completed  (Sat Mar 21 08:52:09 2026)
Out-of-bound Trials:
[    0     7     0     3 19644     0     4     4     0   565     7     0
     0     0     0     0]
Best Parameters: (chisq=369.0163)
[ 1.10352344e+00 -1.06147069e+00  9.35733623e-01 -9.21114850e-02
  3.00000000e-01  1.42788938e+03  2.88207408e+01  1.09418304e-01
  1.51745559e+03  3.58474238e+01  1.07059543e-01  6.59012777e-01
  1.22070957e-01  1.53688190e-02 -3.11486464e-04  1.23428191e+00]
Gelman-Rubin statistics for free parameters:
[1.05923458 1.06227532 1.00785053 1.05102257 1.0270572  1.01732271
 1.05585339 1.01483024 1.05315693 1.04728488 1.01314041 1.00466911
 1.01368626 1.04123869 1.06317299 1.06386345]

[:::       ]  30.0% completed  (Sat Mar 21 08:52:14 2026)
Out-of-bound Trials:
[    0    60     0    11 31067     0    56     4     0  3581     8     0
     0     0     0     0]
Best Parameters: (chisq=369.0163)
[ 1.10352344e+00 -1.06147069e+00  9.35733623e-01 -9.21114850e-02
  3.00000000e-01  1.42788938e+03  2.88207408e+01  1.09418304e-01
  1.51745559e+03  3.58474238e+01  1.07059543e-01  6.59012777e-01
  1.22070957e-01  1.53688190e-02 -3.11486464e-04  1.23428191e+00]
Gelman-Rubin statistics for free parameters:
[1.00521861 1.00474335 1.00077643 1.00250263 1.0051052  1.00375276
 1.00404438 1.00231312 1.00086072 1.00721792 1.0011775  1.00077979
 1.00662006 1.00314879 1.00490167 1.00437601]
All parameters converged to within 1% of unity.

[::::      ]  40.0% completed  (Sat Mar 21 08:52:19 2026)
Out-of-bound Trials:
[    0   140     0    28 43593     0   112     5     0  6947     9     0
     0     0     0     0]
Best Parameters: (chisq=369.0163)
[ 1.10352344e+00 -1.06147069e+00  9.35733623e-01 -9.21114850e-02
  3.00000000e-01  1.42788938e+03  2.88207408e+01  1.09418304e-01
  1.51745559e+03  3.58474238e+01  1.07059543e-01  6.59012777e-01
  1.22070957e-01  1.53688190e-02 -3.11486464e-04  1.23428191e+00]
Gelman-Rubin statistics for free parameters:
[1.00268298 1.00270818 1.0008663  1.00163829 1.00245516 1.00301832
 1.00148402 1.0024456  1.00176316 1.00402805 1.0019037  1.00082382
 1.00238695 1.00116129 1.00272493 1.0025363 ]
All parameters converged to within 1% of unity.

[:::::     ]  50.0% completed  (Sat Mar 21 08:52:24 2026)
Out-of-bound Trials:
[    0   225     0    46 55676     1   171     7     0 10868     9     0
     0     0     0     0]
Best Parameters: (chisq=369.0163)
[ 1.10352344e+00 -1.06147069e+00  9.35733623e-01 -9.21114850e-02
  3.00000000e-01  1.42788938e+03  2.88207408e+01  1.09418304e-01
  1.51745559e+03  3.58474238e+01  1.07059543e-01  6.59012777e-01
  1.22070957e-01  1.53688190e-02 -3.11486464e-04  1.23428191e+00]
Gelman-Rubin statistics for free parameters:
[1.00126908 1.00129434 1.0007393  1.00082461 1.00165687 1.00083961
 1.0011652  1.00089968 1.00049386 1.00076211 1.00109321 1.00039088
 1.00186779 1.00052743 1.00128582 1.00121653]
All parameters converged to within 1% of unity.

[::::::    ]  60.0% completed  (Sat Mar 21 08:52:29 2026)
Out-of-bound Trials:
[    0   297     0    60 67828     1   230     8     0 14522    10     0
     0     0     0     0]
Best Parameters: (chisq=369.0163)
[ 1.10352344e+00 -1.06147069e+00  9.35733623e-01 -9.21114850e-02
  3.00000000e-01  1.42788938e+03  2.88207408e+01  1.09418304e-01
  1.51745559e+03  3.58474238e+01  1.07059543e-01  6.59012777e-01
  1.22070957e-01  1.53688190e-02 -3.11486464e-04  1.23428191e+00]
Gelman-Rubin statistics for free parameters:
[1.0006177  1.0006618  1.00042321 1.00043509 1.0013354  1.0007525
 1.00103297 1.00089818 1.00039832 1.00066618 1.00061866 1.00033533
 1.00068634 1.00029928 1.0006526  1.00063551]
All parameters converged to within 1% of unity.

All parameters satisfy the GR convergence threshold of 1.01, stopping
the MCMC.

MCMC Summary:
-------------
  Number of evaluated samples:        602505
  Number of parallel chains:               9
  Average iterations per chain:        66945
  Burned-in iterations per chain:      20000
  Thinning factor:                         5
  MCMC sample size (thinned, burned):  84501
  Acceptance rate:   21.82%

Parameter name     best fit   median      1sigma_low   1sigma_hi        S/N
--------------- -----------  -----------------------------------  ---------
B_mean           1.1035e+00   1.1309e+00 -3.2298e-02  4.0162e-02       29.7
B_PC1           -1.0615e+00  -7.7860e-01 -3.5156e-01  4.1930e-01        2.7
B_PC2            9.3573e-01   9.3289e-01 -2.0209e-02  2.0028e-02       46.7
B_PC3           -9.2111e-02  -5.7800e-02 -5.5978e-02  6.2137e-02        1.6
B_PC4            3.0000e-01   2.7897e-01 -2.9114e-02  1.5499e-02       12.8
G1430_peak       1.4279e+03   1.4279e+03 -1.3796e+00  1.4466e+00     1015.8
G1430_std        2.8821e+01   2.8759e+01 -1.1639e+00  1.2536e+00       23.9
G1430_amp        1.0942e-01   1.0864e-01 -4.0385e-03  4.0005e-03       27.2
G1515_peak       1.5175e+03   1.5176e+03 -1.5280e+00  1.5477e+00      988.1
G1515_std        3.5847e+01   3.6081e+01 -2.0434e+00  2.1675e+00       18.3
G1515_amp        1.0706e-01   1.0687e-01 -3.5668e-03  3.5514e-03       30.1
H1635_mean       6.5901e-01   6.5922e-01 -3.0151e-03  3.1052e-03      215.4
H1635_PC1        1.2207e-01   1.2012e-01 -1.4457e-02  1.4331e-02        8.4
H1635_PC2        1.5369e-02   1.4393e-02 -2.2173e-02  2.1394e-02        0.7
m               -3.1149e-04  -2.6339e-04 -5.9476e-05  7.1063e-05        4.7
b                1.2343e+00   1.2200e+00 -2.2149e-02  1.8870e-02       58.8

  Best-parameter's chi-squared:       367.5930
  Best-parameter's -2*log(posterior): 369.0163
  Bayesian Information Criterion:     469.8369
  Reduced chi-squared:                  0.6338
  Standard deviation of residuals:  0.00785345

For a detailed summary with all parameter posterior statistics see
/home/docs/checkouts/readthedocs.org/user_builds/pyiroglass/checkouts/latest/docs/examples/transmission_ftir/NPZTXTFILES/RESULTS/AC4_OL49_021920_30x30_H2O_a_statistics.txt

Output sampler files:
  /home/docs/checkouts/readthedocs.org/user_builds/pyiroglass/checkouts/latest/docs/examples/transmission_ftir/NPZTXTFILES/RESULTS/AC4_OL49_021920_30x30_H2O_a_statistics.txt
  /home/docs/checkouts/readthedocs.org/user_builds/pyiroglass/checkouts/latest/docs/examples/transmission_ftir/NPZTXTFILES/RESULTS/AC4_OL49_021920_30x30_H2O_a.npz
  /home/docs/checkouts/readthedocs.org/user_builds/pyiroglass/checkouts/latest/docs/examples/transmission_ftir/LOGFILES/RESULTS/AC4_OL49_021920_30x30_H2O_a.log

::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
  Multi-core Markov-chain Monte Carlo (mc3).
  Version 3.2.1.
  Copyright (c) 2015-2026 Patricio Cubillos and collaborators.
  mc3 is open-source software under the MIT license (see LICENSE).
::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::


::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
  Warning:
    The number of requested CPUs (4) is >= than the number of
available CPUs (2).  Enforced ncpu to 1.
::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::

Least-squares best-fitting parameters:
  [ 5.11148229e-01 -2.96910154e+00  2.75566346e-01 -5.06622353e-01
  2.27844135e-01  1.42739189e+03  3.45255619e+01  5.04958695e-02
  1.51924154e+03  3.21665128e+01  5.31133463e-02  3.00627000e-01
 -3.90990568e-02  5.29060317e-02 -4.45749687e-04  7.21455933e-01]

Yippee Ki Yay Monte Carlo!
Start MCMC chains  (Sat Mar 21 08:52:40 2026)

[:         ]  10.0% completed  (Sat Mar 21 08:52:45 2026)
Out-of-bound Trials:
[   3 8595    0  246   99    3  192  148    0    7  218   16    0    0
    0    0]
Best Parameters: (chisq=49.0306)
[ 5.11148229e-01 -2.96910154e+00  2.75566346e-01 -5.06622353e-01
  2.27844135e-01  1.42739189e+03  3.45255619e+01  5.04958695e-02
  1.51924154e+03  3.21665128e+01  5.31133463e-02  3.00627000e-01
 -3.90990568e-02  5.29060317e-02 -4.45749687e-04  7.21455933e-01]

[::        ]  20.0% completed  (Sat Mar 21 08:52:50 2026)
Out-of-bound Trials:
[    4 16102     0   466   346    68  2420   158    21   598   248    16
     0     0     0     0]
Best Parameters: (chisq=49.0306)
[ 5.11148229e-01 -2.96910154e+00  2.75566346e-01 -5.06622353e-01
  2.27844135e-01  1.42739189e+03  3.45255619e+01  5.04958695e-02
  1.51924154e+03  3.21665128e+01  5.31133463e-02  3.00627000e-01
 -3.90990568e-02  5.29060317e-02 -4.45749687e-04  7.21455933e-01]
Gelman-Rubin statistics for free parameters:
[1.02076236 1.02467033 1.01400152 1.03513932 1.03239383 1.00864323
 1.02701389 1.04614775 1.01132106 1.01727461 1.03861216 1.00686197
 1.01263586 1.01811287 1.02299379 1.02706944]

[:::       ]  30.0% completed  (Sat Mar 21 08:52:55 2026)
Out-of-bound Trials:
[    7 24251     0  1009   982   235  5370   165    47  2036   266    17
     0     0     0     0]
Best Parameters: (chisq=49.0306)
[ 5.11148229e-01 -2.96910154e+00  2.75566346e-01 -5.06622353e-01
  2.27844135e-01  1.42739189e+03  3.45255619e+01  5.04958695e-02
  1.51924154e+03  3.21665128e+01  5.31133463e-02  3.00627000e-01
 -3.90990568e-02  5.29060317e-02 -4.45749687e-04  7.21455933e-01]
Gelman-Rubin statistics for free parameters:
[1.00544631 1.00547109 1.00319301 1.00640567 1.00804962 1.00176744
 1.00640876 1.00631558 1.0019974  1.00225273 1.00479421 1.00196052
 1.00073164 1.00386915 1.00523967 1.00528807]
All parameters converged to within 1% of unity.

[::::      ]  40.0% completed  (Sat Mar 21 08:53:00 2026)
Out-of-bound Trials:
[   11 33556     0  1573  1734   417  8917   181    87  3739   277    17
     0     0     0     0]
Best Parameters: (chisq=49.0306)
[ 5.11148229e-01 -2.96910154e+00  2.75566346e-01 -5.06622353e-01
  2.27844135e-01  1.42739189e+03  3.45255619e+01  5.04958695e-02
  1.51924154e+03  3.21665128e+01  5.31133463e-02  3.00627000e-01
 -3.90990568e-02  5.29060317e-02 -4.45749687e-04  7.21455933e-01]
Gelman-Rubin statistics for free parameters:
[1.00390637 1.0038337  1.00157842 1.00369581 1.00354223 1.00132066
 1.00397326 1.00248246 1.00128719 1.00167506 1.00150512 1.00056814
 1.00058125 1.0019471  1.00377545 1.00362818]
All parameters converged to within 1% of unity.

[:::::     ]  50.0% completed  (Sat Mar 21 08:53:05 2026)
Out-of-bound Trials:
[   11 43735     0  2284  2579   625 12450   189   132  5939   287    17
     0     0     0     0]
Best Parameters: (chisq=49.0306)
[ 5.11148229e-01 -2.96910154e+00  2.75566346e-01 -5.06622353e-01
  2.27844135e-01  1.42739189e+03  3.45255619e+01  5.04958695e-02
  1.51924154e+03  3.21665128e+01  5.31133463e-02  3.00627000e-01
 -3.90990568e-02  5.29060317e-02 -4.45749687e-04  7.21455933e-01]
Gelman-Rubin statistics for free parameters:
[1.0027205  1.00263884 1.00074028 1.00221174 1.00264945 1.00029112
 1.00195089 1.00153947 1.00096123 1.00101372 1.00069723 1.0005901
 1.00110861 1.0012097  1.00256906 1.0024207 ]
All parameters converged to within 1% of unity.

[::::::    ]  60.0% completed  (Sat Mar 21 08:53:10 2026)
Out-of-bound Trials:
[   13 54053     0  3005  3442   824 16229   205   187  7817   299    18
     0     0     0     0]
Best Parameters: (chisq=49.0306)
[ 5.11148229e-01 -2.96910154e+00  2.75566346e-01 -5.06622353e-01
  2.27844135e-01  1.42739189e+03  3.45255619e+01  5.04958695e-02
  1.51924154e+03  3.21665128e+01  5.31133463e-02  3.00627000e-01
 -3.90990568e-02  5.29060317e-02 -4.45749687e-04  7.21455933e-01]
Gelman-Rubin statistics for free parameters:
[1.00267714 1.00245252 1.0006809  1.00180584 1.00248326 1.00044293
 1.0011652  1.00100646 1.00047455 1.00074224 1.00053807 1.00053531
 1.00058745 1.00074834 1.0024796  1.00220433]
All parameters converged to within 1% of unity.

All parameters satisfy the GR convergence threshold of 1.01, stopping
the MCMC.

MCMC Summary:
-------------
  Number of evaluated samples:        602685
  Number of parallel chains:               9
  Average iterations per chain:        66965
  Burned-in iterations per chain:      20000
  Thinning factor:                         5
  MCMC sample size (thinned, burned):  84537
  Acceptance rate:   23.43%

Parameter name     best fit   median      1sigma_low   1sigma_hi        S/N
--------------- -----------  -----------------------------------  ---------
B_mean           5.1115e-01   5.5629e-01 -3.4267e-02  5.1998e-02       11.7
B_PC1           -2.9691e+00  -2.5032e+00 -3.5175e-01  5.4184e-01        6.6
B_PC2            2.7557e-01   2.7035e-01 -2.0716e-02  2.0067e-02       13.5
B_PC3           -5.0662e-01  -4.4757e-01 -4.8635e-02  7.1521e-02        8.3
B_PC4            2.2784e-01   1.8912e-01 -4.5157e-02  3.8014e-02        5.4
G1430_peak       1.4274e+03   1.4272e+03 -3.1154e+00  3.1299e+00      459.5
G1430_std        3.4526e+01   3.4615e+01 -2.9901e+00  2.8321e+00       12.6
G1430_amp        5.0496e-02   4.9979e-02 -4.2514e-03  4.2179e-03       11.9
G1515_peak       1.5192e+03   1.5195e+03 -2.8096e+00  2.7830e+00      540.1
G1515_std        3.2167e+01   3.2923e+01 -2.6306e+00  3.0353e+00       11.7
G1515_amp        5.3113e-02   5.3276e-02 -3.6329e-03  3.6454e-03       14.5
H1635_mean       3.0063e-01   3.0112e-01 -3.1097e-03  3.0681e-03       97.7
H1635_PC1       -3.9099e-02  -4.1670e-02 -1.4664e-02  1.4383e-02        2.7
H1635_PC2        5.2906e-02   5.1888e-02 -1.9658e-02  1.8198e-02        2.8
m               -4.4575e-04  -3.6666e-04 -5.9626e-05  9.1579e-05        5.9
b                7.2146e-01   6.9785e-01 -2.7889e-02  1.7961e-02       31.4

  Best-parameter's chi-squared:        48.0236
  Best-parameter's -2*log(posterior):  49.0306
  Bayesian Information Criterion:     150.2674
  Reduced chi-squared:                  0.0828
  Standard deviation of residuals:  0.0028386

For a detailed summary with all parameter posterior statistics see
/home/docs/checkouts/readthedocs.org/user_builds/pyiroglass/checkouts/latest/docs/examples/transmission_ftir/NPZTXTFILES/RESULTS/AC4_OL53_101220_256s_30x30_a_statistics.txt

Output sampler files:
  /home/docs/checkouts/readthedocs.org/user_builds/pyiroglass/checkouts/latest/docs/examples/transmission_ftir/NPZTXTFILES/RESULTS/AC4_OL53_101220_256s_30x30_a_statistics.txt
  /home/docs/checkouts/readthedocs.org/user_builds/pyiroglass/checkouts/latest/docs/examples/transmission_ftir/NPZTXTFILES/RESULTS/AC4_OL53_101220_256s_30x30_a.npz
  /home/docs/checkouts/readthedocs.org/user_builds/pyiroglass/checkouts/latest/docs/examples/transmission_ftir/LOGFILES/RESULTS/AC4_OL53_101220_256s_30x30_a.log

::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
  Multi-core Markov-chain Monte Carlo (mc3).
  Version 3.2.1.
  Copyright (c) 2015-2026 Patricio Cubillos and collaborators.
  mc3 is open-source software under the MIT license (see LICENSE).
::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::


::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
  Warning:
    The number of requested CPUs (4) is >= than the number of
available CPUs (2).  Enforced ncpu to 1.
::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::

Least-squares best-fitting parameters:
  [ 4.00000000e+00  7.53559746e-01 -4.45578602e-01 -3.77207870e-01
  1.77129404e-01  1.43150803e+03  3.10778684e+01  6.77796785e-02
  1.52279967e+03  3.51836439e+01  7.59276772e-02  1.73134548e-01
  6.68571636e-02 -3.77537031e-02 -2.32934293e-04  2.46271035e+00]

Yippee Ki Yay Monte Carlo!
Start MCMC chains  (Sat Mar 21 08:53:22 2026)

[:         ]  10.0% completed  (Sat Mar 21 08:53:26 2026)
Out-of-bound Trials:
[18602     0     0    20    67     1    37    72     1    93   120     2
     0     0     0     0]
Best Parameters: (chisq=1775.0923)
[ 4.00000000e+00  7.53559746e-01 -4.45578602e-01 -3.77207870e-01
  1.77129404e-01  1.43150803e+03  3.10778684e+01  6.77796785e-02
  1.52279967e+03  3.51836439e+01  7.59276772e-02  1.73134548e-01
  6.68571636e-02 -3.77537031e-02 -2.32934293e-04  2.46271035e+00]

[::        ]  20.0% completed  (Sat Mar 21 08:53:31 2026)
Out-of-bound Trials:
[34960     0     0    42   149     5   214    83    19   717   126     4
     0     0     0     0]
Best Parameters: (chisq=1775.0923)
[ 4.00000000e+00  7.53559746e-01 -4.45578602e-01 -3.77207870e-01
  1.77129404e-01  1.43150803e+03  3.10778684e+01  6.77796785e-02
  1.52279967e+03  3.51836439e+01  7.59276772e-02  1.73134548e-01
  6.68571636e-02 -3.77537031e-02 -2.32934293e-04  2.46271035e+00]
Gelman-Rubin statistics for free parameters:
[1.0069463  1.00760957 1.0111384  1.00580479 1.02531698 1.05098037
 1.05185213 1.02747599 1.01490951 1.08809746 1.02020589 1.01020207
 1.02102386 1.03840493 1.01251705 1.00932543]

[:::       ]  30.0% completed  (Sat Mar 21 08:53:36 2026)
Out-of-bound Trials:
[50420     2     0    94   249    24   588    93    61  4299   130     4
     0     0     0     0]
Best Parameters: (chisq=1775.0923)
[ 4.00000000e+00  7.53559746e-01 -4.45578602e-01 -3.77207870e-01
  1.77129404e-01  1.43150803e+03  3.10778684e+01  6.77796785e-02
  1.52279967e+03  3.51836439e+01  7.59276772e-02  1.73134548e-01
  6.68571636e-02 -3.77537031e-02 -2.32934293e-04  2.46271035e+00]
Gelman-Rubin statistics for free parameters:
[1.00416043 1.00214098 1.0010912  1.00128792 1.00420751 1.00523657
 1.00294074 1.00173551 1.00445291 1.00223495 1.00185895 1.0019691
 1.003742   1.0022123  1.002084   1.00150358]
All parameters converged to within 1% of unity.

[::::      ]  40.0% completed  (Sat Mar 21 08:53:41 2026)
Out-of-bound Trials:
[66228     4     0   148   386    49  1039    98   123  8840   134     4
     0     0     0     0]
Best Parameters: (chisq=1775.0923)
[ 4.00000000e+00  7.53559746e-01 -4.45578602e-01 -3.77207870e-01
  1.77129404e-01  1.43150803e+03  3.10778684e+01  6.77796785e-02
  1.52279967e+03  3.51836439e+01  7.59276772e-02  1.73134548e-01
  6.68571636e-02 -3.77537031e-02 -2.32934293e-04  2.46271035e+00]
Gelman-Rubin statistics for free parameters:
[1.00307498 1.00130711 1.00089884 1.00130698 1.00444038 1.00214697
 1.00138388 1.00283574 1.00072023 1.00266791 1.00076865 1.0010687
 1.0008727  1.00215994 1.00196773 1.00147294]
All parameters converged to within 1% of unity.

[:::::     ]  50.0% completed  (Sat Mar 21 08:53:46 2026)
Out-of-bound Trials:
[81538     5     0   204   523    86  1616   102   187 13686   135     4
     0     0     0     0]
Best Parameters: (chisq=1775.0923)
[ 4.00000000e+00  7.53559746e-01 -4.45578602e-01 -3.77207870e-01
  1.77129404e-01  1.43150803e+03  3.10778684e+01  6.77796785e-02
  1.52279967e+03  3.51836439e+01  7.59276772e-02  1.73134548e-01
  6.68571636e-02 -3.77537031e-02 -2.32934293e-04  2.46271035e+00]
Gelman-Rubin statistics for free parameters:
[1.00159811 1.00119059 1.00071577 1.00105406 1.00229662 1.00227327
 1.00071909 1.00188917 1.00046301 1.00208204 1.0008696  1.0011668
 1.00063619 1.00155944 1.00121443 1.00147034]
All parameters converged to within 1% of unity.

[::::::    ]  60.0% completed  (Sat Mar 21 08:53:51 2026)
Out-of-bound Trials:
[96408     7     0   265   686   138  2269   112   249 18328   140     4
     0     0     0     0]
Best Parameters: (chisq=1775.0923)
[ 4.00000000e+00  7.53559746e-01 -4.45578602e-01 -3.77207870e-01
  1.77129404e-01  1.43150803e+03  3.10778684e+01  6.77796785e-02
  1.52279967e+03  3.51836439e+01  7.59276772e-02  1.73134548e-01
  6.68571636e-02 -3.77537031e-02 -2.32934293e-04  2.46271035e+00]
Gelman-Rubin statistics for free parameters:
[1.00046286 1.00056808 1.00055624 1.00052816 1.00129271 1.00196271
 1.00074868 1.00065852 1.00033804 1.00162663 1.00036756 1.0007795
 1.00067767 1.00066627 1.00071579 1.00082269]
All parameters converged to within 1% of unity.

All parameters satisfy the GR convergence threshold of 1.01, stopping
the MCMC.

MCMC Summary:
-------------
  Number of evaluated samples:        602685
  Number of parallel chains:               9
  Average iterations per chain:        66965
  Burned-in iterations per chain:      20000
  Thinning factor:                         5
  MCMC sample size (thinned, burned):  84537
  Acceptance rate:   20.97%

Parameter name     best fit   median      1sigma_low   1sigma_hi        S/N
--------------- -----------  -----------------------------------  ---------
B_mean           4.0000e+00   3.9957e+00 -7.1727e-03  3.2242e-03      659.5
B_PC1            7.5356e-01   6.9416e-01 -9.9456e-02  8.9869e-02        7.7
B_PC2           -4.4558e-01  -4.4496e-01 -1.9319e-02  1.9027e-02       23.5
B_PC3           -3.7721e-01  -3.8590e-01 -2.6556e-02  2.6390e-02       14.1
B_PC4            1.7713e-01   1.8315e-01 -2.2506e-02  2.4357e-02        7.5
G1430_peak       1.4315e+03   1.4316e+03 -2.2034e+00  2.3586e+00      623.9
G1430_std        3.1078e+01   3.1430e+01 -2.2853e+00  2.5258e+00       12.9
G1430_amp        6.7780e-02   6.7501e-02 -3.9715e-03  3.9932e-03       16.9
G1515_peak       1.5228e+03   1.5232e+03 -2.3022e+00  2.5342e+00      631.2
G1515_std        3.5184e+01   3.5566e+01 -2.7506e+00  2.6275e+00       14.2
G1515_amp        7.5928e-02   7.5503e-02 -3.2393e-03  3.2859e-03       23.1
H1635_mean       1.7313e-01   1.7295e-01 -3.0938e-03  3.0314e-03       56.5
H1635_PC1        6.6857e-02   6.8068e-02 -1.4464e-02  1.4154e-02        4.7
H1635_PC2       -3.7754e-02  -4.1647e-02 -2.3456e-02  2.2982e-02        1.6
m               -2.3293e-04  -2.4248e-04 -1.4271e-05  1.2100e-05       16.5
b                2.4627e+00   2.4658e+00 -6.1473e-03  6.4411e-03      384.8

  Best-parameter's chi-squared:       1773.9710
  Best-parameter's -2*log(posterior): 1775.0923
  Bayesian Information Criterion:     1876.2148
  Reduced chi-squared:                   3.0586
  Standard deviation of residuals:  0.0172524

For a detailed summary with all parameter posterior statistics see
/home/docs/checkouts/readthedocs.org/user_builds/pyiroglass/checkouts/latest/docs/examples/transmission_ftir/NPZTXTFILES/RESULTS/STD_D1010_012821_256s_100x100_a_statistics.txt

Output sampler files:
  /home/docs/checkouts/readthedocs.org/user_builds/pyiroglass/checkouts/latest/docs/examples/transmission_ftir/NPZTXTFILES/RESULTS/STD_D1010_012821_256s_100x100_a_statistics.txt
  /home/docs/checkouts/readthedocs.org/user_builds/pyiroglass/checkouts/latest/docs/examples/transmission_ftir/NPZTXTFILES/RESULTS/STD_D1010_012821_256s_100x100_a.npz
  /home/docs/checkouts/readthedocs.org/user_builds/pyiroglass/checkouts/latest/docs/examples/transmission_ftir/LOGFILES/RESULTS/STD_D1010_012821_256s_100x100_a.log

pig.calculate_baselines returns Volatile_PH, a DataFrame of the output peak heights and associated uncertainties. Let’s see what is included.

[10]:
Volatile_PH
[10]:
PH_3550_M PH_3550_STD H2Ot_3550_MAX BL_H2Ot_3550_MAX H2Ot_3550_SAT PH_1635_BP PH_1635_STD PH_1515_BP PH_1515_STD P_1515_BP ... PC4_BP PC4_STD m_BP m_STD b_BP b_STD PH_1635_PC1_BP PH_1635_PC1_STD PH_1635_PC2_BP PH_1635_PC2_STD
AC4_OL49_021920_30x30_H2O_a 2.17225 0.002212 2.649837 0.459859 * 0.659013 0.003059 0.107060 0.003555 1517.455587 ... 0.300000 0.023423 -0.000311 0.000067 1.234282 0.020997 0.122071 0.014448 0.015369 0.021622
AC4_OL53_101220_256s_30x30_a 1.523343 0.003309 1.631044 0.135541 - 0.300627 0.003078 0.053113 0.003662 1519.241536 ... 0.227844 0.042126 -0.000446 0.000076 0.721456 0.023000 -0.039099 0.014543 0.052906 0.018752
STD_D1010_012821_256s_100x100_a 2.710894 0.001587 2.915508 0.206641 * 0.173135 0.003065 0.075928 0.003292 1522.799673 ... 0.177129 0.023537 -0.000233 0.000014 2.462710 0.006400 0.066857 0.014302 -0.037754 0.022991

3 rows × 45 columns

Because the DataFrame has 45 columns and the display above truncates them, let’s list all the column names.

[11]:
Volatile_PH.columns
[11]:
Index(['PH_3550_M', 'PH_3550_STD', 'H2Ot_3550_MAX', 'BL_H2Ot_3550_MAX',
       'H2Ot_3550_SAT', 'PH_1635_BP', 'PH_1635_STD', 'PH_1515_BP',
       'PH_1515_STD', 'P_1515_BP', 'P_1515_STD', 'STD_1515_BP', 'STD_1515_STD',
       'PH_1430_BP', 'PH_1430_STD', 'P_1430_BP', 'P_1430_STD', 'STD_1430_BP',
       'STD_1430_STD', 'PH_5200_M', 'PH_5200_STD', 'PH_4500_M', 'PH_4500_STD',
       'STN_P5200', 'ERR_5200', 'STN_P4500', 'ERR_4500', 'AVG_BL_BP',
       'AVG_BL_STD', 'PC1_BP', 'PC1_STD', 'PC2_BP', 'PC2_STD', 'PC3_BP',
       'PC3_STD', 'PC4_BP', 'PC4_STD', 'm_BP', 'm_STD', 'b_BP', 'b_STD',
       'PH_1635_PC1_BP', 'PH_1635_PC1_STD', 'PH_1635_PC2_BP',
       'PH_1635_PC2_STD'],
      dtype='str')

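All columns with the prefix PH represent a peak height. The suffix _M denotes the mean value, the suffix _BP denotes the best-fit value from the MCMC fit (for example, PH_1635_BP matches the best-fit H1635_mean reported in the MCMC summary above), and the suffix _STD denotes 1 \(\sigma\).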

The column H2Ot_3550_SAT returns a - if the sample is not saturated, and a * if the sample is saturated. This is based on the maximum absorbance of the peak, and the * warning indicates that the concentrations must be considered more carefully. The functions that calculate concentration, used later in the workflow, handle this and suggest the best values to use.
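The flag column can be used directly to pick out saturated samples. The sketch below rebuilds a two-column subset of Volatile_PH from the values shown in the output above and filters on the * flag. The variable names flags and saturated are chosen here for illustration.

```python
import pandas as pd

# Subset of Volatile_PH, with values copied from the output above
flags = pd.DataFrame(
    {
        "PH_3550_M": [2.17225, 1.523343, 2.710894],
        "H2Ot_3550_SAT": ["*", "-", "*"],
    },
    index=[
        "AC4_OL49_021920_30x30_H2O_a",
        "AC4_OL53_101220_256s_30x30_a",
        "STD_D1010_012821_256s_100x100_a",
    ],
)

# Keep the names of samples whose saturation flag is '*'
saturated = flags.index[flags["H2Ot_3550_SAT"] == "*"].tolist()
print(saturated)
```

On the full DataFrame, the same expression applied to Volatile_PH gives the list of samples to treat with extra care.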

The columns STN_P5200 and STN_P4500 represent the signal-to-noise ratios for the \(\mathrm{H_2O_{m,5200}}\) and \(\mathrm{OH^-_{4500}}\) peaks. If the values are greater than 4, indicating that the signal is meaningful, the ERR_5200 and ERR_4500 columns return a - value. If the signal-to-noise ratio is too low, the * warning is returned.

The remaining columns describe the fitting parameters used to generate the baseline and the \(\mathrm{H_2O_{m,1635}}\) peak, so you can regenerate the baseline yourself.

Outputs

Quite a few figures, log files, and npz files are generated by pig.calculate_baselines, assuming you provide an export path. Let’s look at a few of them together.

PyIRoGlass creates this figure for visualizing how each peak within the 1000-5500 cm \(\mathrm{^{-1}}\) range is fit, with the peak heights shown.

[12]:
Image("https://github.com/sarahshi/PyIRoGlass/raw/main/docs/_static/AC4_OL49_021920_30x30_H2O_a.png")
[12]:
../../_images/examples_transmission_ftir_PyIRoGlass_Transmission_27_0.png

We can visualize how well PyIRoGlass fits this transmission FTIR spectrum with the modelfit figure. This plots the fit from MC3 against the measured spectrum, along with the residual of the fit.

[13]:
Image("https://github.com/sarahshi/PyIRoGlass/raw/main/docs/_static/AC4_OL49_021920_30x30_H2O_a_modelfit.png")
[13]:
../../_images/examples_transmission_ftir_PyIRoGlass_Transmission_29_0.png

The histogram figure shows the distribution of posterior probability densities, with the mean value displayed in the navy dashed line. The shaded region represents the 68% confidence interval around the value.

[14]:
Image("https://github.com/sarahshi/PyIRoGlass/raw/main/docs/_static/AC4_OL49_021920_30x30_H2O_a_histogram.png")
[14]:
../../_images/examples_transmission_ftir_PyIRoGlass_Transmission_31_0.png
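The summary statistics drawn on the histogram figure can be sketched from a set of posterior samples. This example uses synthetic normal samples as a stand-in for one parameter and assumes the 68% interval is taken between the 16th and 84th percentiles; PyIRoGlass may compute its interval differently.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-in for the posterior samples of a single fitting parameter
samples = rng.normal(loc=1.52, scale=0.05, size=100_000)

# Mean value (the dashed line in the histogram figure)
mean = samples.mean()

# Central 68% interval, assumed here to span the 16th to 84th percentiles
lower, upper = np.percentile(samples, [16, 84])

print(f"mean = {mean:.4f}, 68% interval = [{lower:.4f}, {upper:.4f}]")
```

For a distribution that is close to normal, half the width of this interval approximates one standard deviation.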

The pairwise figure plots the posterior probability density distribution for the 16 fitting parameters of Equation 10, allowing for the visualization of covariance between the parameters. Including this covariance is what allows uncertainties to be propagated properly.

[15]:
Image("https://github.com/sarahshi/PyIRoGlass/raw/main/docs/_static/AC4_OL49_021920_30x30_H2O_a_pairwise.png")
[15]:
../../_images/examples_transmission_ftir_PyIRoGlass_Transmission_33_0.png
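To see why covariance matters for uncertainty, consider two correlated parameters whose sum is the quantity of interest. The sketch below draws synthetic samples from an assumed two-parameter distribution (not PyIRoGlass output) and compares the variance of the sum with and without the covariance term.

```python
import numpy as np

rng = np.random.default_rng(1)

# Assumed covariance matrix for two negatively correlated parameters
cov_true = np.array([[0.04, -0.03],
                     [-0.03, 0.04]])
samples = rng.multivariate_normal(mean=[1.0, 2.0], cov=cov_true, size=50_000)

# Sample covariance matrix and correlation coefficient
cov_est = np.cov(samples, rowvar=False)
corr = np.corrcoef(samples, rowvar=False)[0, 1]

# Variance of (a + b): ignoring covariance vs. including it
var_naive = cov_est[0, 0] + cov_est[1, 1]
var_full = var_naive + 2.0 * cov_est[0, 1]

print(f"correlation = {corr:.2f}")
print(f"variance ignoring covariance = {var_naive:.4f}")
print(f"variance including covariance = {var_full:.4f}")
```

With negatively correlated parameters, ignoring the covariance overestimates the uncertainty of the sum; with positive correlation it would be underestimated.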

The trace figure shows how the parameters evolve through MCMC sampling.

[16]:
Image("https://github.com/sarahshi/PyIRoGlass/raw/main/docs/_static/AC4_OL49_021920_30x30_H2O_a_trace.png")
[16]:
../../_images/examples_transmission_ftir_PyIRoGlass_Transmission_35_0.png

LOG and NPZ

.log files record the performance of the MCMC algorithm through the samples, and the best parameters at each 10% increment. These are shown above.

.npz files store all the best-parameters, sampled parameters, etc. in a ready-to-use NumPy format.
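If you do want to inspect an .npz file, NumPy can list its contents before you load any arrays. The sketch below writes a small synthetic file first so that it runs on its own; the array names bestp and posterior are placeholders, and the names stored in the files written by PyIRoGlass may differ.

```python
import os
import tempfile

import numpy as np

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "example.npz")

    # Write a synthetic archive; the array names here are hypothetical
    np.savez(path, bestp=np.array([1.0, 2.0]), posterior=np.zeros((5, 2)))

    # np.load returns a lazy archive; .files lists the stored array names
    with np.load(path) as data:
        keys = sorted(data.files)
        shapes = {key: data[key].shape for key in keys}

print(keys)
print(shapes)
```

For a real output file, replace path with the location of the .npz file in your export directory and inspect data.files to see what is available.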

We won’t open these here, but these are quite useful to review!

Concentrations

We now want to convert all those peak heights (with uncertainties) to concentrations (with uncertainties) by applying the Beer-Lambert Law. We do so with the pig.calculate_concentrations function, which takes the following parameters and draws N samples for a secondary MCMC:

  • Volatile_PH: DataFrame with peak heights and their associated uncertainties; the output from pig.calculate_baselines

  • chemistry: DataFrame of chemical data

  • thickness: DataFrame of thickness data

  • export_path: Output directory name, or None to prevent figure generation

  • N: Number of Monte Carlo simulations to perform for uncertainty estimation, with default of 500,000

  • T: Temperature in Celsius at which the density is calculated, with default of 25°C

  • P: Pressure in bars at which the density is calculated, with default of 1 bar

  • model: Choice of density model, with options of the default “LS” for Lesher and Spera (2015) and “IT” for Iacovino and Till (2019)

[17]:
concentrations_df = pig.calculate_concentrations(Volatile_PH, chemistry, thickness, export_path)
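To make the conversion concrete, here is a minimal sketch of the Beer-Lambert Law with Monte Carlo uncertainty propagation. It is not the PyIRoGlass implementation, and the function beer_lambert_wtpct is a hypothetical helper. The peak height and molar absorptivity are taken from values displayed on this page for AC4_OL53_101220_256s_30x30_a; the thickness, its uncertainty, and the density are assumed for illustration only, so the result is not expected to reproduce the reported concentration.

```python
import numpy as np


def beer_lambert_wtpct(absorbance, molar_mass, density, thickness_um, epsilon):
    """Concentration in wt.% from the Beer-Lambert Law.

    absorbance: peak height (dimensionless)
    molar_mass: g/mol
    density: g/L
    thickness_um: sample thickness in micrometers
    epsilon: molar absorptivity in L/(mol cm)
    """
    thickness_cm = thickness_um * 1e-4
    return 100.0 * molar_mass * absorbance / (density * thickness_cm * epsilon)


rng = np.random.default_rng(42)
n = 100_000  # fewer draws than the library default of 500,000

# Peak height and molar absorptivity with their 1-sigma values from this page
absorbance = rng.normal(1.523343, 0.003309, n)
epsilon = rng.normal(64.493655, 7.380834, n)

# Assumed values for illustration only
thickness_um = rng.normal(40.0, 3.0, n)
density = 2750.0  # g/L
molar_mass_h2o = 18.015  # g/mol

conc = beer_lambert_wtpct(absorbance, molar_mass_h2o, density, thickness_um, epsilon)

print(f"H2O = {conc.mean():.2f} +/- {conc.std():.2f} wt.%")
```

Sampling every input from its own distribution lets the uncertainties in peak height, thickness, and molar absorptivity all contribute to the spread of the final concentration.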

We’re all done now! Let’s display the concentrations_df DataFrame, which contains all results.

[18]:
concentrations_df
[18]:
H2Ot_MEAN H2Ot_STD H2Ot_3550_M H2Ot_3550_STD H2Ot_3550_SAT H2Om_1635_BP H2Om_1635_STD CO2_MEAN CO2_STD CO2_1515_BP ... epsilon_H2Ot_3550 sigma_epsilon_H2Ot_3550 epsilon_H2Om_1635 sigma_epsilon_H2Om_1635 epsilon_CO2 sigma_epsilon_CO2 epsilon_H2Om_5200 sigma_epsilon_H2Om_5200 epsilon_OH_4500 sigma_epsilon_OH_4500
AC4_OL49_021920_30x30_H2O_a 2.548482 0.160921 2.420678 0.189311 * 1.301476 0.189927 754.152837 36.497233 745.935523 ... 66.142594 7.503769 37.322108 8.645060 258.429949 18.362798 1.009458 0.300803 0.861196 0.279571
AC4_OL53_101220_256s_30x30_a 4.036918 0.431818 4.036918 0.431818 - 1.49134 0.258725 737.553152 61.542834 756.185937 ... 64.493655 7.380834 34.452486 8.504269 293.261300 16.287120 0.901474 0.295829 0.779611 0.274924
STD_D1010_012821_256s_100x100_a 0.901034 0.096635 1.201368 0.087969 * 0.153231 0.025371 156.607585 7.173708 165.487008 ... 62.751984 7.251813 31.421482 8.354348 311.700491 15.127604 0.787418 0.290510 0.693438 0.270004

3 rows × 35 columns

There are a few things to note. Each column with the suffix _MEAN represents the mean value, _BP represents the best-parameter from MCMC, and _STD represents the standard deviation. We recommend the use of the H2Ot_MEAN, H2Ot_STD, CO2_MEAN, and CO2_STD columns. The columns with the suffix _STN show the signal-to-noise ratio of the NIR peaks, and the columns with the prefix ERR_ just process this information, returning a - if the peaks are meaningful and a * if the signal is too low.

Concentrations of \(\mathrm{H_2O}\) depend on whether your sample is saturated or not. If your sample is unsaturated (marked by H2Ot_3550_SAT=='-'), the column H2Ot_MEAN==H2Ot_3550_M. If your sample is saturated (marked by H2Ot_3550_SAT=='*'), the column of H2Ot_MEAN==(H2Om_1635_BP+OH_4500_M). The \(\mathrm{H_2O_{t, 3550}}\) peak cannot be used, given potential nonlinearity in the Beer-Lambert Law. See the discussion of this handling of speciation in the paper.
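The selection rule described above can be written as a single conditional. The sketch below applies it to a toy DataFrame and stores the result in a new column, H2Ot_MEAN_sketch, so it is not confused with the library output. The H2Ot_3550_M and H2Om_1635_BP values are copied from the table above; the OH_4500_M values are hidden in the truncated display, so the ones used here are illustrative placeholders.

```python
import numpy as np
import pandas as pd

toy = pd.DataFrame(
    {
        "H2Ot_3550_SAT": ["*", "-"],
        "H2Ot_3550_M": [2.420678, 4.036918],
        "H2Om_1635_BP": [1.301476, 1.49134],
        "OH_4500_M": [1.2, 2.3],  # illustrative placeholders, not real output
    },
    index=["AC4_OL49_021920_30x30_H2O_a", "AC4_OL53_101220_256s_30x30_a"],
)

saturated = toy["H2Ot_3550_SAT"] == "*"

# Saturated: sum the molecular water and hydroxyl species.
# Unsaturated: use the total water peak at 3550 directly.
toy["H2Ot_MEAN_sketch"] = np.where(
    saturated,
    toy["H2Om_1635_BP"] + toy["OH_4500_M"],
    toy["H2Ot_3550_M"],
)

print(toy[["H2Ot_3550_SAT", "H2Ot_MEAN_sketch"]])
```

For the unsaturated sample, the result equals H2Ot_3550_M, which matches the H2Ot_MEAN value reported in the table above.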

The column Density contains the densities used for the final concentration. The values of Density and Density_Sat will differ if the sample is saturated, reflecting the different concentrations of \(\mathrm{H_2O_m}\) used in each density calculation.

Tau and Eta are the compositional parameters required for determining molar absorptivity. All calculated molar absorptivities (epsilon_ prefix) and their uncertainties (sigma_ prefix) from the inversion are provided in the DataFrame.