Scientific Reports volume 16, Article number: 15924 (2026)
Healthcare-associated infections (HCAIs) contribute significantly to global mortality, driven by increasing antimicrobial resistance. Rapid, high-throughput bacterial detection is crucial for infection control and patient care. We report a real-time, multiplex lamp-based Photoionization Detector (PID) assisted by AI-image-based analysis for bacterial identification. Using four lamps with varying ionization energies, the sensor selectively ionizes VOCs emitted by bacteria, producing four distinct current curves for each target species (Escherichia coli, Staphylococcus aureus, Pseudomonas aeruginosa, and Klebsiella pneumoniae). These curves were transformed into image representations, capturing their spectral patterns for bacterial differentiation. A pre-trained ResNet-18 Convolutional Neural Network (CNN) within a Few-Shot Learning (FSL) framework extracted key features, enabling accurate (> 88%) bacterial differentiation even with limited labeled data. The sensor detected bacterial concentrations as low as 10² CFU and distinguished contamination levels. The synergistic integration of PID sensing with AI-driven analysis offers a powerful approach to rapid bacterial diagnostics, demonstrating strong potential for clinical implementation and improved patient care. This study marks an early step toward AI-based VOC sensing, where FSL acts as a proof-of-concept under data scarcity.
Healthcare-associated infections (HCAIs) have emerged as an important cause of mortality worldwide, mainly due to escalating antimicrobial resistance1. HCAIs are infections contracted while receiving care in any healthcare setting, from long-term care facilities and hospitals to primary care clinics. These infections are detected at least 48 h after hospitalization or within 30 days of receiving healthcare2,3. It is estimated that 1.7 million patients per year acquire HCAIs while being treated for other health conditions, resulting in over 98,000 deaths4. Examples of HCAIs include urinary tract, bloodstream, and respiratory infections (e.g., pneumonia)5, and the main causative agents are the ESKAPE-E pathogens (Enterococcus faecium, Staphylococcus aureus, Klebsiella pneumoniae, Acinetobacter baumannii, Pseudomonas aeruginosa, Enterobacter species, and Escherichia coli)2. Therefore, to reduce both the incidence and mortality associated with HCAIs, early detection of the causative pathogens is essential for effective treatment and for preventing severe complications6. This has led to increased interest in faster and more accurate detection technologies to enable prompt targeted antimicrobial therapy and help mitigate antimicrobial resistance7. Rapid diagnostic tests are critical for reducing reliance on empirical broad-spectrum antibiotic use by enabling earlier pathogen identification. Moreover, faster detection facilitates earlier antimicrobial susceptibility testing (AST), supporting improved clinical decision-making7,8,9.
While conventional bacterial detection methods, such as culture and enumeration techniques, molecular approaches, and immunological-based methods, can provide detection within a few days, they have shortcomings. Their drawbacks include limited sensitivity, laborious protocols, the need for expert personnel and expensive equipment, and unsuitability for point-of-care application10,11. Moreover, these methods allow neither real-time monitoring of bacteria nor detection without invasive sample collection.
Bacteria emit a matrix of volatile organic compounds (VOCs) resulting from their metabolism12. Common bacterial VOCs include alcohols, ketones, benzenoids, terpenoids, and sulfur-containing compounds12. These molecules have a high vapor pressure at room temperature and are therefore highly volatile and mobile in the environment13. Each bacterial species is known to have a characteristic VOC fingerprint that could help in its differentiation14,15. Hence, VOC profiling has been proposed as an alternative tool for bacterial detection and identification16,17,18,19,20. Bacterial differentiation can be achieved through the emission of significantly different VOCs at distinct concentrations. However, real-time detection and identification of bacterial species via their associated VOC fingerprints remain outstanding challenges.
Several studies have focused on bacterial VOC detection using electronic noses21,22,23, gas chromatography-mass spectrometry (GC-MS)16,17,19, metal oxide sensors (MOS)21, ion mobility spectrometry (IMS)24,25, or secondary electrospray ionization mass spectrometry (SESI-MS)26,27. Despite their potential, these technologies have downsides. GC-MS, the gold standard for VOC detection and identification, is costly, requires expert personnel, does not allow real-time monitoring, and involves time-consuming analysis28. In particular, the detection of bacterial VOCs requires pre-concentration using techniques such as Solid-Phase Microextraction (SPME), since without this step it is not possible to reach the detection limits of the analytical equipment19,29,30. Other technologies are often expensive, involve specialized equipment, and have high energy consumption13. Moreover, they can suffer from cross-sensitivity and drift over time, and they are affected by humidity and temperature13,28.
Photoionization detectors (PIDs) rely on a light source that emits vacuum ultraviolet (VUV) photons of a specific energy31. The VUV photons ionize most organic gas molecules at concentrations ranging from sub-parts per billion to parts per million, but not the main constituents of air13. Each VOC has a unique threshold energy for molecular ionization, known as the Ionization Potential (IP). Thus, if the VUV photon energy exceeds the VOC IP, the PID can ionize the VOC. This feature underlies the broad applicability of PIDs, although each VOC has a distinct response factor32,33. Recent advances in PID technology have made these detectors low-cost, portable (compact and light), and more resistant to temperature and humidity32. In particular, the PID-based system presented in this work does not rely on analyte-specific extraction, chromatographic separation, or compound identification. Instead, it measures the overall dynamic response of bacterial VOC emissions directly in the headspace in real time, allowing bacterial detection directly from a culture bottle without any sample preparation or VOC pre-concentration, as is required for GC-MS17,34. Consequently, pre-enrichment steps such as SPME and time-consuming spectral acquisitions are not necessary26,27.
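The ionization criterion can be illustrated with a minimal Python sketch. The lamp energies are the four used in this work; the compound list and its ionization potentials are approximate, illustrative values chosen as assumptions for this example, not measurements from this study, and the function name `ionizing_lamps` is hypothetical.

```python
# Photon energies (eV) of the four lamps used in this work.
LAMP_ENERGIES_EV = [8.4, 9.6, 10.0, 10.6]

# Approximate ionization potentials (eV). Illustrative assumptions only:
# these compounds are not claimed to be the VOCs detected in this study.
EXAMPLE_IP_EV = {
    "toluene": 8.83,
    "acetone": 9.70,
    "ethanol": 10.48,
}


def ionizing_lamps(ip_ev, lamps=LAMP_ENERGIES_EV):
    """Return the lamp energies that exceed the given ionization potential."""
    return [energy for energy in lamps if energy > ip_ev]


for name, ip_ev in EXAMPLE_IP_EV.items():
    print(f"{name} (IP = {ip_ev} eV) is ionized by lamps: {ionizing_lamps(ip_ev)}")
```

Under these assumed values, a low-IP compound contributes to the signal of several lamps while a high-IP compound contributes only to the 10.6 eV lamp, which is why the four current curves carry partly different information.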
Raman spectroscopy also leverages molecular fingerprints for bacterial identification. However, its practical deployment is limited by the need for complex and costly optical instrumentation, including lasers, spectrometers, and sensitive detectors35,36. Additionally, reliable measurements often require careful sample preparation and controlled experimental conditions, which restrict portability and routine field use37,38.
Studies on VOC profiling have utilized clustering techniques39 and dimensionality reduction40, but relatively few have explored supervised machine learning (ML) approaches. A major limitation has been the lack of sufficient data to effectively train such models41. While other similar studies employing Convolutional Neural Network (CNN) with larger datasets and different scopes were able to perform extensive training and achieve better classification performance42,43, in this study, the time-consuming nature of data generation further reinforced this limitation. To address this constraint, we adopted a Few-Shot Learning (FSL) strategy, enabling accurate classification despite the reduced number of labeled examples per class44. In this initial stage of artificial intelligence (AI) implementation, where data availability is inherently limited and acquisition is costly, FSL serves as a practical framework to explore the feasibility of automated discrimination under constrained conditions. Rather than being advantageous solely due to data scarcity, its application here establishes a methodological foundation for subsequent development stages. As larger yet still data-constrained datasets become available, the approach could naturally progress toward meta-learning strategies, where FSL principles can be extended through episodic learning to enhance generalization and adaptability across related tasks45.
This work describes the development of a real-time multiplex lamp-based PID, in which four lamps with different VUV photon energies were used to simultaneously acquire four real-time current curves for each target bacterial species (E. coli, S. aureus, P. aeruginosa, and K. pneumoniae). The sensing strategy focuses on monitoring the temporal response pattern of the signal, which is subsequently analyzed using AI-based methods to extract diagnostic information.
Although many rapid bacterial detection approaches have been reported, including optical/imaging methods, electrochemical and impedance sensors, and microfluidic platforms, these techniques often rely on fluorescent labeling, functionalized electrodes, or complex fabrication and sample preparation steps46,47,48,49,50. In contrast, the PID + AI approach presented here enables label-free VOC detection with no sample preparation and a simple sensing platform, offering a more streamlined strategy for rapid bacterial detection. Another key advantage of this system is its versatility across different background matrices; as long as bacterial VOC patterns remain distinguishable from the surrounding medium, accurate identification is possible, even in complex samples such as urine or blood, which typically produce highly characteristic VOC profiles51,52.
To achieve bacterial differentiation, we transformed current curves into image representations, capturing their distinct patterns. Specifically, a pre-trained ResNet-18 CNN53 was employed within an FSL framework44. To the best of our knowledge, this is the first study to report a four-lamp-based PID for bacterial detection and discrimination, supported by an image-based AI algorithm.
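One possible way to turn current curves into an image is sketched below in plain Python. The rendering used in this study is not specified here, so the grid size, the binary pixel encoding, the vertical stacking of the four lamp panels, and the function names are all assumptions made for illustration; only the 150 nA full-scale value is taken from the sensor saturation limit reported in this work.

```python
def curve_to_image(curve, height=32, width=32, i_max=150.0):
    """Rasterize one current curve (nA) into a height x width binary grid.

    Assumed encoding: one lit pixel per column, with the row given by the
    current scaled to the full-scale value i_max.
    """
    if len(curve) < 2:
        raise ValueError("need at least two samples")
    image = [[0] * width for _ in range(height)]
    for col in range(width):
        # Resample the curve at evenly spaced positions (nearest sample).
        idx = round(col * (len(curve) - 1) / (width - 1))
        value = min(max(curve[idx], 0.0), i_max)
        # Row 0 is the top of the image, so high currents map to small rows.
        row = (height - 1) - round(value / i_max * (height - 1))
        image[row][col] = 1
    return image


def curves_to_image(curves, **kwargs):
    """Stack the per-lamp panels vertically into a single image."""
    stacked = []
    for curve in curves:
        stacked.extend(curve_to_image(curve, **kwargs))
    return stacked


# Toy example: four synthetic ramps standing in for the four lamp signals.
ramp = [150.0 * i / 99 for i in range(100)]
image = curves_to_image([ramp, ramp, ramp, ramp])
print(len(image), "rows x", len(image[0]), "columns")
```

In practice the resulting array would be resized and normalized to whatever input format the pre-trained network expects.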
This study employed a PID sensor equipped with four lamps emitting distinct ionization energies (8.4, 9.6, 10, and 10.6 eV) to generate real-time curves for four target bacterial species: E. coli, P. aeruginosa, S. aureus, and K. pneumoniae. The core concept of the PID sensor was previously described54. Here, the setup comprises four sensors arranged in an array. A schematic representation of the homemade setup for bacterial VOC detection is displayed in Fig. 1.
The resulting readouts of the four PID sensors are curves that represent the measured current during the ionization of VOCs emitted by the bacteria. This approach constitutes an indirect bacterial detection method, as bacteria are identified based on the unique VOC profiles they produce during growth. The resulting current curves obtained for each PID sensor at a concentration of 10⁵ CFU/mL are depicted in Fig. 2 (see Supplementary Material Figures S1-S3 for data at other bacterial concentrations).
Schematic representation of the devised homemade setup for bacterial VOC detection by the lamp-based PID. (A) Representation of the ionization process occurring inside the chamber and components involved, such as VUV lamp and Transimpedance Amplifier (TIA); (B) Photograph of one ionization chamber used in the setup; (C) Schematic overview of the complete setup, including the flask for trapping the bacterial VOCs, the four PIDs equipped with different ionization lamps and corresponding chambers, and the Mass Flow controllers (MFC).
The results demonstrate distinct current curves for each bacterial species when comparing signals from different bacteria at the same ionization energy. It is well-established that each bacterial species produces a unique VOC matrix14,55. As summarized in Supplementary Tables S1–S4, the four target bacterial species emit a range of VOCs across the four selected ionization levels. While certain VOCs are common to more than one species, the overall emission patterns are distinct, reflecting species-specific VOC signatures. Therefore, the findings from our PID sensors corroborate this expectation, revealing distinct current signals generated by the different target bacterial species.
Current (nA) recorded over the 20-minute measurement period for each ionization lamp (A) 8.4 eV, (B) 9.6 eV, (C) 10 eV, and (D) 10.6 eV for each of the four bacterial species at 10⁵ CFU/mL. Data represent the average signal of at least three independent assays for each lamp and bacterial species. Control signals were acquired for non-inoculated Mueller-Hinton media (MH).
It is essential to note that the signals from the four PID sensors must be considered independently (see methods section) due to the different sensor parameters (such as illumination intensity and electric field). Therefore, when comparing the signal of a specific bacterial species at a fixed concentration across different ionization energies (i.e., different lamps), direct comparison is not appropriate. This explains why signals obtained with the 8.4 eV lamp are generally higher than those from other lamps, while signals from the 10 eV lamp are typically lower despite its higher ionization energy. However, comparisons between the four bacterial species within the same lamp can be made.
P. aeruginosa consistently exhibited the highest mean current signal across the four ionization energies for the different concentrations (as evidenced in Fig. 2 and Supplementary Figures S1-S3). This trend generally held true at the representative concentration of 10⁵ CFU/mL (Fig. 2). P. aeruginosa and K. pneumoniae consistently produced the strongest signals, indicating that these species are likely to generate higher concentrations of multiple VOCs. Our findings are consistent with previous literature, which has reported a broad range of VOCs emitted by P. aeruginosa in both culture media and clinical samples (Supplementary Table S3)56,57,58. Consequently, current signals for P. aeruginosa frequently approached the sensor’s saturation limit (150 nA), indicating substantial VOC emission, particularly at higher bacterial concentrations. K. pneumoniae also demonstrated high VOC production, with high signals (≈ 100–150 nA) at elevated bacterial concentrations. It is important to note that for both species, actual VOC levels may be higher than the measured values, because the sensor reached saturation (usually at around 600 s). This sensor limitation could explain the similar signal readings observed for P. aeruginosa and K. pneumoniae at high bacterial contents, such as 10⁵ CFU/mL. Figure 2 demonstrates that E. coli and S. aureus exhibit unique current profiles that are markedly different from those of P. aeruginosa and K. pneumoniae across all four lamp settings. These distinctive patterns facilitate the unambiguous differentiation of these two species.
We also investigated the quantification capabilities of the developed sensor. The underlying hypothesis is that bacterial concentration correlates directly with VOC production. Higher bacterial concentrations result in a larger number of actively growing colonies. This increased microbial activity translates to enhanced metabolic processes, resulting in a concomitant increase in the concentration and diversity of emitted VOCs.
Assessment of the lamp-based PID response to different bacterial concentrations for all lamps. (A) K. pneumoniae, (B) P. aeruginosa, (C) E. coli, and (D) S. aureus. Data represent the average signal of at least three independent assays for each lamp and bacterial species. NC stands for Negative Control (non-inoculated MH). * Statistically significant difference (p < 0.05) relative to the respective control samples (without bacteria), determined by two-way ANOVA with Tukey’s multiple comparison test.
As depicted in Fig. 3, our sensor successfully detected all bacterial species across a range of concentrations (10², 10³, 10⁵, and 10⁷ CFU/mL).
When evaluating all ionization lamps collectively, the lamp-based PID exhibited a minimum detection limit of 10² CFU across all bacterial species (Fig. 3). Notably, at least one of the lamps (mainly the 10.6 eV lamp) demonstrated a significantly different signal (p < 0.05) compared to the control for all four species, confirming the effective detection of low bacterial content. Specifically, for K. pneumoniae (Fig. 3A), both the 8.4 eV and 10.6 eV lamps yielded signals significantly different from the control. Additionally, as mentioned before, P. aeruginosa generated the strongest signals overall. The values for this species ranged from 5.046 ± 0.329 nA to 128.986 ± 6.318 nA (Fig. 3B).
Overall, concentration-dependent response signals were observed, with a particularly notable relationship in the case of P. aeruginosa (Fig. 3B). Two-way ANOVA combined with Tukey’s multiple comparisons test was used to determine whether there was a statistically significant difference among the four levels of bacterial content examined. In the case of P. aeruginosa, the lamp signals distinguished among the four concentration levels with high statistical significance, except for the 8.4 eV lamp. From the data of the other three lamps, it is possible to interpolate the expected signal for concentrations ranging from 100 to 10⁷ CFU/mL. However, the differences between signals at the four concentration levels were not statistically significant for the other species (Figs. 3A, C, and D). Nonetheless, significant differences were observed when comparing low bacterial concentrations (10² and 10³ CFU/mL) with high concentrations (10⁵ and 10⁷ CFU/mL), indicating that these levels are distinguishable. This suggests the sensor can reliably determine whether the bacterial content is low or high, independent of the bacterial species present.
The signals at high bacterial concentrations (10⁵ and 10⁷ CFU/mL) were frequently comparable, particularly for the 8.4 and 10.6 eV lamps. Several factors may account for the lack of a significant difference between the two bacterial contents. Firstly, bacterial growth is exponential only until the stationary phase is reached, at which point VOC production may plateau, limiting further signal increases. Secondly, the sensor’s sensitivity is optimized for low current values. At high concentrations (10⁵ and 10⁷ CFU/mL), the sensor reached saturation (around 150 nA), leading to decreased sensitivity and a less pronounced relationship between concentration and signal.
When considering the mean signal obtained across the four ionization energies for a fixed concentration, a clear hierarchy was observed (P. aeruginosa > K. pneumoniae > E. coli > S. aureus) at higher concentrations. At lower concentrations (10²–10³ CFU/mL), this separation diminished, with P. aeruginosa, K. pneumoniae, and E. coli producing comparable signal intensities. This reduced discrimination is likely attributable to lower signal-to-noise ratios and increased relative variability near the detection limit, whereas S. aureus remained consistently lower across all concentrations.
However, for a fixed concentration, the relative signal intensity between species is not always preserved across ionization energies. In some cases, a species that exhibits a higher signal under one ionization condition may be surpassed by another at a different energy. This variability can be explained by differences in the VOC profiles of each species, particularly in terms of compound-specific ionization energies and abundances. As each lamp selectively ionizes compounds within a given energy range, shifts in ionization energy alter the subset of detectable VOCs, thereby affecting the relative signal intensities observed between species. For instance, E. coli exhibits the highest signal across all tested concentrations when using the 10.6 eV lamp. This behavior may be attributed to the preferential ionization of a subset of VOCs with ionization energies near 10.6 eV, which are more abundantly or selectively produced by E. coli. Consequently, this results in an enhanced signal under this specific ionization condition.
With respect to bacterial identification, our focus was on verifying whether the PID sensor signals contain sufficient discriminative information to distinguish bacterial species through AI-based modeling. The classification approach employed here follows an FSL strategy, implemented using Transfer Learning59 and Prototypical Networks (PN)60, applied to image-derived data. This proof-of-concept evaluation provides the foundation for advancing the framework toward more sophisticated learning paradigms.
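The core decision rule of a Prototypical Network can be sketched in a few lines of Python: each class prototype is the mean of its support embeddings, and a query is assigned to the class with the nearest prototype. In this work the embeddings would come from the pre-trained feature extractor; the two-dimensional toy vectors, the class labels, and the function names below are hypothetical, and squared Euclidean distance is assumed as the metric.

```python
def class_prototypes(support):
    """Map each label to the mean of its support embeddings.

    support: dict mapping label -> list of equal-length embedding vectors.
    """
    prototypes = {}
    for label, vectors in support.items():
        dim = len(vectors[0])
        prototypes[label] = [
            sum(vector[d] for vector in vectors) / len(vectors) for d in range(dim)
        ]
    return prototypes


def squared_distance(a, b):
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def classify(query, prototypes):
    """Return the label whose prototype is closest to the query embedding."""
    return min(prototypes, key=lambda label: squared_distance(query, prototypes[label]))


# Toy embeddings standing in for CNN features (hypothetical values).
support = {
    "EC": [[0.0, 0.0], [0.0, 2.0]],
    "SA": [[10.0, 10.0], [12.0, 10.0]],
}
prototypes = class_prototypes(support)
print(classify([1.0, 1.0], prototypes))
print(classify([9.0, 9.0], prototypes))
```

Because no network weights are updated at classification time, this rule can be applied with very few labeled examples per class, which is what makes it attractive under data scarcity.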
To comprehensively assess the performance of the PN framework under data-scarce conditions, two complementary evaluation strategies were adopted. This dual approach enables evaluation under both realistic and controlled scenarios. The first strategy preserves the original data distribution, including class imbalance and inter-day variability, and serves as a baseline for assessing PN performance in an unbalanced setting. The second provides a controlled, class-balanced environment aligned with the FSL paradigm, enabling evaluation of the intrinsic capability of PN to generalize from limited examples without the influence of class imbalance.
The first approach uses the original dataset, comprising measurements acquired across multiple independent experimental days, thereby capturing variability associated with bacterial culturing and sensor acquisition (Supplementary Table S5). Given the limited sample size and class imbalance across all classes, Leave-One-Out Cross-Validation (LOOCV) was employed to maximize the use of available data while enabling unbiased evaluation of each sample. No class balancing techniques were applied, allowing the evaluation to reflect the model’s baseline performance under the natural class distribution. While such imbalance would typically be addressed in a deployment setting, it is intentionally retained here to assess performance under imbalanced data conditions. Samples from different acquisition days were randomly distributed across the support and query sets to ensure that inter-day variability was represented.
In the second approach, a synthetic dataset was generated to balance all classes, with 15 samples per class (Supplementary Table S5). Model evaluation followed an episodic scheme consistent with the PN framework, with support (n_support = 5) and query (n_query = 1) sets per class. Results were averaged over 100 iterations to reduce sampling variability and provide a stable estimate of performance under balanced FSL conditions.
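The episodic scheme can be sketched as follows, assuming five support samples and one query sample are drawn per class without overlap and that accuracy is pooled over 100 episodes. The function names, the `predict` callback, and the toy dataset are hypothetical stand-ins; the actual classifier would be the PN model.

```python
import random


def sample_episode(dataset, n_support=5, n_query=1, rng=random):
    """Draw disjoint support and query sets for every class.

    dataset: dict mapping label -> list of samples.
    """
    support, query = {}, {}
    for label, samples in dataset.items():
        if len(samples) < n_support + n_query:
            raise ValueError(f"class {label!r} has too few samples")
        chosen = rng.sample(samples, n_support + n_query)
        support[label] = chosen[:n_support]
        query[label] = chosen[n_support:]
    return support, query


def episodic_accuracy(dataset, predict, n_episodes=100, seed=0):
    """Pool query accuracy over repeated random episodes."""
    rng = random.Random(seed)
    correct = total = 0
    for _ in range(n_episodes):
        support, query = sample_episode(dataset, rng=rng)
        for label, items in query.items():
            for item in items:
                correct += predict(item, support) == label
                total += 1
    return correct / total


# Toy balanced dataset with 15 scalar "samples" per class (hypothetical).
dataset = {"A": list(range(15)), "B": list(range(100, 115))}
toy_predict = lambda item, support: "A" if item < 50 else "B"
print(episodic_accuracy(dataset, toy_predict))
```

Fixing the seed makes the sequence of episodes reproducible, which helps when comparing several classifiers on the same draws.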
The first approach relies on the LOOCV method, as outlined in ref. 61. In this process, the model is trained N times, where N is the total number of data points in the dataset. Each iteration excludes one data point from the training set and uses it as the test set. The process repeats until every data point has served as the test set. Given the importance of obtaining a reliable performance estimate, particularly for small datasets, this method fits well within the scope of chemometric analysis62. To adapt this method for FSL, the same principle is applied by leaving one image out as the query set, while the remaining images form the support set, so that each image serves as the query exactly once.
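A minimal Python sketch of this leave-one-out adaptation is given below. Each sample is used once as the query while all remaining samples form the support set. The nearest-neighbor predictor and the scalar toy data are hypothetical placeholders for the PN model and the image embeddings.

```python
def loocv_accuracy(samples, labels, predict):
    """Leave-one-out evaluation: every sample is the query exactly once."""
    correct = 0
    for i, (query, true_label) in enumerate(zip(samples, labels)):
        # All other samples, with their labels, form the support set.
        support = [
            (sample, label)
            for j, (sample, label) in enumerate(zip(samples, labels))
            if j != i
        ]
        correct += predict(query, support) == true_label
    return correct / len(samples)


def nearest_neighbor(query, support):
    """Placeholder predictor: label of the closest support sample."""
    return min(support, key=lambda pair: abs(pair[0] - query))[1]


# Toy scalar data (hypothetical values).
samples = [0.0, 0.1, 0.2, 5.0, 5.1, 5.2]
labels = ["A", "A", "A", "B", "B", "B"]
print(loocv_accuracy(samples, labels, nearest_neighbor))
```

With N samples this performs N evaluations, so every data point contributes to the accuracy estimate, at the cost of support sets that differ by only one sample between iterations.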
Confusion matrix representing the prediction results of our model, focusing on bacteria and concentration classes for the first approach (LOOCV).
The PN model achieved an accuracy of 67.74% in classifying bacterial species and their respective concentrations. The confusion matrix displayed in Fig. 4 provides important insights. The model effectively distinguishes the control class (non-inoculated MH media) from all samples, with the exception of some misclassifications involving E. coli (EC) at a concentration of 10² CFU. Furthermore, it demonstrates strong performance in identifying various bacterial species, attaining an accuracy of 88.71%. A separate confusion matrix, which underscores the model’s capability in bacterial differentiation, is presented in Fig. 5.
Confusion matrix of the prediction results for the first approach, considering bacteria classes.
Despite the effectiveness of this approach in distinguishing between bacterial species, the model encounters challenges in differentiating between similar concentrations. For E. coli (EC), the lowest concentration (10²) is frequently misclassified as one of the lowest concentrations of K. pneumoniae (KP), as the control (MH), or as the next higher concentration of EC itself (10³). Other EC concentrations are confused with adjacent concentrations rather than with other bacterial species. Similarly, for KP, lower concentrations are occasionally mistaken for the lowest EC concentration, whereas higher concentrations are consistently classified correctly. In the case of P. aeruginosa (PA), the lowest concentration is misclassified in 50% of cases as the second-lowest concentration of S. aureus (SA), but higher concentrations are accurately identified. Likewise, SA exhibits occasional misclassifications at lower concentrations, and the second-highest concentration (10⁵) is entirely confused with the highest one (10⁷). Notably, misclassifications for SA occur only within the same species. Overall, the model successfully differentiates between low (10², 10³) and high concentrations (10⁵, 10⁷), but for EC, it still struggles with mid-range concentrations, leading to some misclassifications.
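The gap between the joint (species plus concentration) accuracy and the species-only accuracy follows from how errors are counted: a prediction with the right species but the wrong concentration is an error in the joint task and a hit in the species task. The Python sketch below illustrates this with made-up labels and predictions; the label format `SPECIES_concentration` is a hypothetical convention, and the numbers are not results from this study.

```python
def accuracy(true_labels, predicted_labels):
    """Fraction of predictions that match the true labels exactly."""
    pairs = list(zip(true_labels, predicted_labels))
    return sum(true == predicted for true, predicted in pairs) / len(pairs)


def species_of(label):
    """'EC_1e3' -> 'EC'; the control label 'MH' has no concentration suffix."""
    return label.split("_")[0]


# Made-up example: three concentration mix-ups, one of them across species.
true = ["EC_1e2", "EC_1e3", "SA_1e5", "PA_1e2", "MH"]
pred = ["EC_1e3", "EC_1e3", "SA_1e7", "SA_1e3", "MH"]

joint_accuracy = accuracy(true, pred)
species_accuracy = accuracy(
    [species_of(label) for label in true],
    [species_of(label) for label in pred],
)
print(joint_accuracy, species_accuracy)
```

In this toy case the two within-species concentration errors are forgiven at the species level, so only the cross-species error remains.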
For the second approach, we generated synthetic curves to balance the dataset using basic linear interpolation combined with random noise addition, ensuring that all classes had an equal number of samples: 15 per class. Following this, random sampling of small support (n_support = 5) and query sets (n_query = 1) from the dataset simulates predefined FSL scenarios. This procedure repeats over 100 iterations.
Confusion matrix representing the prediction results of our model, focusing on bacteria and concentration classes for the second approach (n_support = 5, n_query = 1).
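One way such synthetic curves could be produced is sketched below. The exact procedure is not detailed here, so this example assumes that a synthetic curve is a random convex combination of two real curves from the same class with additive Gaussian noise; the noise level, the function names, and the toy curves are hypothetical.

```python
import random


def synthesize_curve(curve_a, curve_b, noise_sd=0.5, rng=random):
    """Linearly interpolate between two curves and add Gaussian noise."""
    if len(curve_a) != len(curve_b):
        raise ValueError("curves must have the same length")
    weight = rng.random()  # interpolation weight in [0, 1)
    return [
        weight * a + (1.0 - weight) * b + rng.gauss(0.0, noise_sd)
        for a, b in zip(curve_a, curve_b)
    ]


def balance_class(curves, target=15, noise_sd=0.5, rng=random):
    """Add synthetic curves until the class contains `target` samples."""
    if len(curves) < 2:
        raise ValueError("need at least two real curves to interpolate")
    balanced = list(curves)
    while len(balanced) < target:
        curve_a, curve_b = rng.sample(curves, 2)
        balanced.append(synthesize_curve(curve_a, curve_b, noise_sd, rng))
    return balanced


# Toy class with three short "real" curves (hypothetical values, in nA).
real_curves = [[0.0, 1.0, 2.0], [2.0, 3.0, 4.0], [1.0, 2.0, 3.0]]
balanced = balance_class(real_curves, target=15, rng=random.Random(0))
print(len(balanced))
```

Because every synthetic curve lies between two measured curves of the same class, the generated data inherit the class pattern, which may partly explain why evaluation on a balanced synthetic set looks more consistent than evaluation on raw measurements.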
This approach showed an improved ability to distinguish samples with close concentrations. The generated data follow patterns similar to the original samples, providing greater consistency. As shown in the confusion matrix in Fig. 6, the model achieved an accuracy of 85.18% for detecting bacterial concentration combinations and 96.24% for detecting bacteria. A separate confusion matrix that highlights bacterial differentiation for this approach is shown in Fig. 7. This result underscores the model’s potential when the sensor generates more consistent data.
Confusion matrix of the prediction results for the second approach, considering bacteria classes.
To evaluate the effectiveness of FSL, we applied a range of traditional ML methods to the same problem to establish a benchmark. We do not compare our results with other FSL studies, as many rely on substantially larger datasets that permit fine-tuning of pre-trained models. Fine-tuning, whereby model parameters are further adjusted to align with the characteristics of the target data, has been shown to improve accuracy when sufficient labeled data are available63,64. Instead, this work highlights the challenge of developing models under data-scarce conditions, maintaining robustness, and reducing the risk of overfitting60 using a PN coupled with a frozen pre-trained ResNet-18 backbone for feature extraction.
Feature engineering65 was employed to capture key information from each curve, such as the mean, maximum value, area under the curve, and time taken to reach the peak. Subsequently, PCA was used to reduce the number of features to four principal components while preserving 99.99% of the variance. This constituted the pipeline for the traditional ML models: Support Vector Machine (SVM), Random Forest (RF), and K-Nearest Neighbor (KNN). Initially, the LOOCV method was applied to the real data. Following that, an evaluation was conducted using n_support = 5 and n_query = 1 on the balanced dataset to ensure a fair comparison with the FSL tests. The results for both approaches are presented in Tables 1 and 2, respectively.
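The four curve descriptors named above can be computed as in the following Python sketch. The trapezoidal rule for the area under the curve, the use of the first occurrence of the maximum as the peak time, and the function name are assumptions for illustration; the PCA and classifier stages are omitted.

```python
def curve_features(times_s, currents_na):
    """Compute mean, maximum, area under the curve, and time to peak."""
    if len(times_s) != len(currents_na) or len(times_s) < 2:
        raise ValueError("need two equal-length sequences with >= 2 samples")
    peak = max(currents_na)
    # Trapezoidal rule over consecutive sample pairs.
    area = sum(
        (t1 - t0) * (c0 + c1) / 2.0
        for t0, t1, c0, c1 in zip(
            times_s, times_s[1:], currents_na, currents_na[1:]
        )
    )
    return {
        "mean": sum(currents_na) / len(currents_na),
        "max": peak,
        "auc": area,
        # Time of the first sample at which the maximum is reached.
        "time_to_peak": times_s[currents_na.index(peak)],
    }


# Toy curve (hypothetical values): time in seconds, current in nA.
print(curve_features([0.0, 1.0, 2.0, 3.0], [0.0, 2.0, 4.0, 2.0]))
```

Applying this function to each of the four lamp curves of a sample and concatenating the results would give one feature vector per sample for the downstream models.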
SVM and FSL PN achieved the highest performance across both evaluation settings (Tables 1 and 2). Under the LOOCV protocol on the original dataset, both models obtained identical accuracy for the joint classification task (67.74%), while SVM achieved the highest accuracy for bacteria-only classification (91.94%), followed by FSL PN (88.71%). KNN and RF showed lower performance in both tasks. In the balanced, episodic evaluation, all models exhibited improved performance. FSL PN achieved the highest accuracy for both tasks (85.18% for joint classification and 96.24% for bacteria classification), followed by RF and SVM. KNN also showed a substantial increase in performance compared to the LOOCV setting. Overall, the transition from the original imbalanced dataset to the balanced synthetic dataset resulted in consistent performance gains across all models, with the most pronounced improvements observed in the joint classification task.
In this work, four bacterial species were distinguished based on their distinct current signals generated in the four PID sensors, which is a consequence of their different metabolic signatures. The primary objective of this study was to achieve bacterial differentiation using photoionization current profiling rather than to characterize or identify the specific VOCs involved. Within this context, several authors have conducted extensive VOC profiling of these bacterial species using GC-MS, such as Filipiak et al.16,17, Tait et al.34, or Fitzgerald et al.66. Although GC-MS allows for the chemical identification of a VOC profile, it is a laborious and time-consuming technique that does not enable real-time detection or direct analysis of the headspace of bacterial cultures. Also, it requires a pre-concentration step using SPME fibers or sorption tubes17,30. When compared to GC-MS, our developed sensor offers several advantages, including faster operation and reduced size. Furthermore, our designed PID sensor is both low-cost and transportable, with a total volume of only about 0.028 m³, with additional development opportunities for further miniaturization. Its compact size makes it highly suitable for practical applications where space and mobility are important. Moreover, since the chambers of all lamps, the electronic circuit, and the lamp drivers share a similar design, the overall sensor benefits from simplicity in construction and operation. This design approach not only reduces manufacturing complexity but also significantly contributes to the low cost and transportability of the device.
Notably, this sensor is also sensitive, allowing direct and real-time analysis of the headspace of bacterial cultures even at 10² CFU/mL without any pre-concentration. Its ability to detect bacterial levels as low as 10² CFU is clinically meaningful. HCAIs, such as bloodstream infections, typically involve bacterial loads of approximately 1–200 CFU67,68, while urinary tract infections often present with levels around 10⁵ CFU69,70. Therefore, this proof-of-concept sensor shows the potential to serve as a rapid early-detection tool for suspected infection cases, reducing the time required for pathogen identification relative to time-consuming culturing techniques (which usually take 24–48 h)71.
Faster detection and identification enable earlier initiation of appropriate antimicrobial therapy and reduce reliance on empirical broad-spectrum treatment. This is particularly relevant for antimicrobial stewardship, as timely pathogen identification supports de-escalation to targeted therapies and may help limit the development of antimicrobial resistance7,9,72. In addition, rapid detection can shorten the time to AST, thereby improving clinical decision-making. Furthermore, previous studies have demonstrated the potential of VOC signatures to differentiate between antibiotic-resistant and susceptible profiles18,27,73. In this context, the proposed PID-based system could, in future developments, leverage VOC profiling to support the identification of optimal antibiotic therapies.
Bacterial detection and identification through the emitted VOC pattern, although an indirect method, has great potential: cell integrity is not compromised, and metabolically active (viable) cells are detected even if they are in a dormant or non-culturable state74, which is not easily achieved by culture and counting techniques. Another key strength of this study lies in the development of a lamp-based PID sensor that overcomes the limitations of traditional bacterial detection methods. By eliminating dependence on probe specificity and cross-reactivity, our sensor achieves real-time monitoring with rapid result acquisition. Furthermore, the application of a dedicated ML model allows for rapid and highly accurate bacterial differentiation, showcasing the power of this combined methodology.
As explored by other authors75,76,77, ML can greatly enhance the analysis of VOC data and facilitate the identification of bacterial VOC signatures. Palma et al. used ML to analyze previously reported literature75. Shauloff et al.76 developed a carbon-dot artificial nose aided by ML for differentiation among four bacterial species, requiring bacterial sample growth on a substrate with heating over a long incubation timeframe. Although that work used a different sensor, its detection was limited to high bacterial concentrations (≥ 10⁷ CFU/mL). Likewise, Henry et al. utilized ML to identify bacteria through excitation-emission spectroscopy, yet only high bacterial concentrations (≥ 10⁶ CFU/mL) resulted in high accuracy values77.
The PID sensor developed in this study generates current according to the concentrations of multiple VOCs present, without discriminating among VOC chemical classes. Although the current profiles of the various bacteria studied are distinct, drawing rapid conclusions would be challenging without tools to enhance data analysis. Hence, this study leveraged ML to improve data analysis and signal processing, accelerate the analytical process, and generate bacterial differentiation patterns by utilizing data from PID experiments and applying FSL techniques. The application of these techniques highlights the model’s potential, especially when coupled with more consistent sensor data, which further enhances the accuracy and reliability of bacterial classification. Notably, the model successfully achieved bacterial differentiation and detection of 10² CFU across all tested bacterial species. While some misclassifications occurred with E. coli when using only the real data, they were less frequent with the balanced approach. The model’s ability to distinguish between low (10²–10³ CFU) and high (10⁵–10⁷ CFU) bacterial loads is also significant. This capability could have practical clinical implications, for example, in urine samples, where urinary tract infections are typically defined by contamination levels around 10⁵–10⁶ CFU/mL69,70. Thus, the model could pave the way to assess, in real time, a positive infection detection and enable differentiation between true infection and non-infectious conditions.
SVM and FSL PN consistently achieved the strongest performance across the two evaluation settings. Under the LOOCV protocol on the original, imbalanced dataset, both models reached comparable accuracy for the joint classification task, while SVM showed slightly higher performance for bacteria-only classification. In contrast, KNN and RF exhibited lower accuracy across both tasks. These differences suggest that SVM and FSL PN are less affected by class imbalance in this setting, whereas KNN and RF appear more sensitive to skewed class distributions.
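The leave-one-out protocol referred to above can be illustrated with a minimal sketch. It is not the evaluation code used in this study: the nearest-centroid classifier, the function names, and the toy two-dimensional feature vectors are all assumptions chosen to keep the example self-contained.

```python
import math
from collections import defaultdict


def nearest_centroid_predict(train_x, train_y, query):
    """Predict the label whose class centroid is closest to `query`."""
    groups = defaultdict(list)
    for x, y in zip(train_x, train_y):
        groups[y].append(x)
    best_label, best_dist = None, math.inf
    for label, members in groups.items():
        centroid = [sum(col) / len(members) for col in zip(*members)]
        dist = math.dist(centroid, query)
        if dist < best_dist:
            best_label, best_dist = label, dist
    return best_label


def loocv_accuracy(xs, ys, predict=nearest_centroid_predict):
    """Hold out each sample once, fit on the rest, and report accuracy."""
    correct = 0
    for i in range(len(xs)):
        train_x = xs[:i] + xs[i + 1:]
        train_y = ys[:i] + ys[i + 1:]
        if predict(train_x, train_y, xs[i]) == ys[i]:
            correct += 1
    return correct / len(xs)


# Toy, deliberately imbalanced data (hypothetical values, not sensor data).
features = [[0.0, 0.1], [0.2, 0.0], [0.1, 0.2], [0.0, 0.0], [5.0, 5.1], [5.2, 4.9]]
labels = ["A", "A", "A", "A", "B", "B"]
accuracy = loocv_accuracy(features, labels)
```

Because every sample is held out exactly once, a minority class with very few members loses a large share of its training examples in each fold, which is one plausible reason why some classifiers are more sensitive to imbalance under this protocol.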
The transition to a balanced synthetic dataset resulted in a marked improvement in performance across all models. This consistent increase indicates that class imbalance constitutes a major limiting factor in the original dataset, particularly for the joint classification task. The more pronounced gains observed for KNN and RF further support the notion that these models are more strongly influenced by class frequency, while SVM and FSL PN maintain comparatively stable performance under imbalanced conditions.
Within this context, the behavior of the FSL approach is of particular interest. Although its performance under imbalanced conditions is comparable to that of SVM, FSL PN achieves the highest accuracy in the balanced episodic evaluation. This improvement indicates that the model can effectively leverage uniform class representation while maintaining consistent discrimination across classes. As the FSL framework relies on comparisons in an embedding space rather than global decision boundaries, it enables effective class discrimination from limited examples.
Overall, these findings highlight the critical role of class distribution in model performance and suggest that different algorithms exhibit varying sensitivity to imbalance in data-scarce settings. While conventional classifiers achieve strong performance under certain conditions, the FSL framework provides a competitive alternative that performs consistently across both imbalanced and balanced scenarios and achieves the highest accuracy in the balanced setting. In addition, FSL offers practical advantages, as it does not require retraining when new classes are introduced, enabling flexible adaptation to evolving datasets, while also reducing the risk of overfitting, which is more likely to occur in limited-data scenarios78. These characteristics make FSL a particularly suitable approach for classification tasks in low-data regimes60.
This balance between reliability and data efficiency highlights the suitability of the method for early-stage sensor development, where extensive labeled datasets are rarely available. Although preliminary, these findings suggest methodological characteristics that could become advantageous in future stages of sensor development, particularly in scenarios where labeled data remain scarce. In subsequent stages, as larger yet still data-constrained datasets become available, this framework could naturally evolve toward meta-learning strategies, implemented through episodic learning, enabling models to learn how to generalize across tasks and enhance adaptability in low-data regimes without extensive retraining45.
Future work will explore the detection of lower bacterial concentrations, potentially down to 1 CFU, by incorporating a pre-concentration sensor. Furthermore, testing other bacterial species relevant to HCAIs and analyzing mixed bacterial populations would provide a more robust dataset for AI algorithm training. This approach will help evaluate whether the AI-aided PID can effectively differentiate bacteria even in complex, mixed-population samples. In addition, future studies will include dedicated validation protocols to assess repeatability and day-to-day consistency, specifically by evaluating cross-day generalization, where models are trained and tested on data collected on separate experimental days. This will allow a more rigorous assessment of robustness to biological variability and potential sensor drift. Testing under more realistic conditions, such as those simulating surface-growing bacteria, biofilms, or using clinical samples, will offer valuable insights into the sensor’s applicability in clinical settings. This study provides a proof-of-principle for an AI-assisted PID sensor, underscoring the promise of AI-driven strategies to transform rapid bacterial detection and support their translation into clinical practice.
The core concept of the PID sensor developed here was previously described54. Each PID sensor includes a chamber specifically constructed to allow VOCs to pass through and a VUV lamp powered by a dedicated driver circuit (Fig. 1 – A and B). Xenon-filled (Heraeus PXR096) and Krypton-filled lamps (Heraeus PKR106) were utilized. During the ionization process, ions and free electrons are generated. An electric field was applied to collect the free electrons, creating an ionization current. Since this ionization current is in the nanoampere range, an integrating transimpedance amplifier (TIA) circuit was used to convert it into a measurable voltage. The resulting analog voltage signals were digitized using a Digital I/O module (782608-01, National Instruments) and analyzed with LabVIEW software, which provided the current integrated over a two-second sampling interval. The control signals for the TIA circuit and the Mass Flow Controllers (MFCs) were transmitted from the computer through the National Instruments (NI) cDAQ-9174 USB chassis and the NI-9201, NI-9263, and NI-9401 input/output modules to the sensor.
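As a rough illustration of how an integrating readout relates voltage to current, the sketch below uses the ideal-integrator relation I = C_f · |ΔV| / T. The feedback capacitance and the voltage change are hypothetical values chosen for the example; only the two-second interval comes from the text, and the actual circuit may differ.

```python
def mean_current_from_integrator(delta_v, feedback_capacitance_f, integration_time_s):
    """Mean input current (A) of an ideal integrator: I = C_f * |dV| / T."""
    if integration_time_s <= 0:
        raise ValueError("integration time must be positive")
    return feedback_capacitance_f * abs(delta_v) / integration_time_s


C_F = 100e-9   # hypothetical feedback capacitance in farads (not from the source)
T_INT = 2.0    # integration interval in seconds (two-second interval in the text)

# A hypothetical 2 V change at the integrator output over one interval.
current_a = mean_current_from_integrator(2.0, C_F, T_INT)
current_na = current_a * 1e9  # convert amperes to nanoamperes
```

With these assumed values the result is about 100 nA, which falls within the nanoampere range described for the ionization current.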
The primary difference among the four PID sensors lies in the energy of the lamps they incorporate to ionize the multiple VOCs emitted by each bacterial species. According to the literature, the VOCs most commonly emitted by bacteria have ionization energies ranging from 7.7 to 13.6 eV (Supplementary Tables S1–S4). Thus, four levels of ionization energy within this range were selected: 8.4 eV (Xenon-filled lamp with a sapphire window), 9.6 eV (Xenon-filled lamp), 10 eV (Krypton-filled lamp with a calcium fluoride window), and 10.6 eV (Krypton-filled lamp). In each sensor, only VOC molecules with IPs lower than the energy of the respective lamp can be ionized and detected. Therefore, the behavior of each bacterial species in each sensor is unique.
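The selection rule described above (a VOC is ionized only by lamps whose photon energy exceeds its ionization energy) can be sketched as follows. The lamp energies are those given in the text; the compound list and its ionization energies are approximate literature values picked for illustration, and no claim is made that these compounds appear in the Supplementary Tables.

```python
LAMP_ENERGIES_EV = (8.4, 9.6, 10.0, 10.6)  # lamp photon energies from the text

# Approximate ionization energies in eV (illustrative assumption, not study data).
EXAMPLE_VOC_IE_EV = {
    "indole": 7.76,
    "dimethyl sulfide": 8.69,
    "acetone": 9.70,
    "ethanol": 10.48,
    "methanol": 10.84,
}


def lamps_that_ionize(ie_ev, lamp_energies=LAMP_ENERGIES_EV):
    """Return the lamp energies able to ionize a compound with the given IE."""
    return [energy for energy in lamp_energies if ie_ev < energy]


response_pattern = {
    name: lamps_that_ionize(ie) for name, ie in EXAMPLE_VOC_IE_EV.items()
}
```

In this toy example indole would contribute to all four channels, acetone only to the 10 eV and 10.6 eV channels, and methanol to none, which shows how a mixture of VOCs can produce a different current pattern across the four lamps.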
The generated current (or measured voltage) depends on several factors, including the concentration and IP of VOCs, the illumination intensity of the lamp, the lamp’s ionization energy, and the applied electric field used to collect the ions. The illumination intensities of Xenon-filled and Krypton-filled lamps differ, and the use of windows to achieve photon energies of 8.4 eV and 10 eV further reduces their intensities. Due to these multiple factors and the non-linear relationship between them and the output signal, it was not possible to configure the lamps to produce the same signal value for identical VOC concentrations. Therefore, the applied electric field for each lamp was adjusted independently, with the primary requirement being that the signal remained detectable even at low bacterial concentrations (10² CFU/mL). This led to the final decision of applying 100 V to the 8.4 eV lamp and 30 V to the 9.6, 10, and 10.6 eV lamps.
The measurement system has a maximum readable current of approximately 150 nA. When bacterial VOCs production generated currents exceeding this threshold, the readout electronics were unable to register higher values. In such cases, the recorded signal was limited to 150 nA. This value is defined as the saturation current, representing the instrumental limit of the readout system rather than a true saturation of bacterial VOC emission.
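The effect of this instrumental limit on a recorded trace can be sketched in a few lines. The 150 nA limit is taken from the text; the function names and the example trace are hypothetical.

```python
SATURATION_NA = 150.0  # maximum readable current reported in the text


def clip_to_saturation(trace_na, limit_na=SATURATION_NA):
    """Limit every reading to the readout maximum, as the electronics would."""
    return [min(value, limit_na) for value in trace_na]


def first_saturated_index(trace_na, limit_na=SATURATION_NA):
    """Index of the first reading at or above the limit, or None if never reached."""
    for index, value in enumerate(trace_na):
        if value >= limit_na:
            return index
    return None


raw = [12.0, 80.0, 149.0, 163.0, 210.0]  # hypothetical "true" currents in nA
clipped = clip_to_saturation(raw)
onset = first_saturated_index(raw)
```

Here the last two readings are recorded as 150 nA, and the onset index marks how quickly the curve saturates, a feature that differs between signals.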
The strains used in this study were purchased from the Spanish Type Culture Collection (CECT). Four bacterial strains belonging to the target species were selected: Escherichia coli CECT 434, Staphylococcus aureus CECT 240, Pseudomonas aeruginosa CECT 118, and Klebsiella pneumoniae CECT 7787. Bacterial strains were recovered from the frozen stocks and grown on Tryptic Soy Agar (Biokar) plates overnight at 37 °C. For pre-inoculum preparation, each strain was aseptically inoculated into 5 mL of Mueller-Hinton (MH) broth (Biokar) and grown overnight at 37 °C in an orbital shaker (VWR) at 120 rpm. Optical density was measured using a Synergy H1 Microplate Reader (BioTek) and adjusted to a range of 0.2–0.5 based on each bacterial growth curve. Serial dilutions of bacterial cells (10²–10⁷) were prepared from the bacterial suspension containing approximately 10⁸ colony-forming units (CFU) per mL in MH. Exact bacterial concentrations (in CFU/mL) were assessed by plating 10 µL of each dilution on MH agar plates in duplicate, in each experiment. As representatives, four bacterial concentrations were used in the PID experiments, namely 10², 10³, 10⁵, and 10⁷ CFU/mL. These were chosen to represent both low and high bacterial loads that are clinically relevant79,80,81.
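The plate-count arithmetic behind these concentrations can be sketched as below. The 10 µL plated volume and duplicate plating follow the text; the ten-fold dilution steps, the colony counts, and the function names are assumptions made for illustration.

```python
def cfu_per_ml(colony_counts, plated_volume_ul, dilution_factor):
    """Estimate CFU/mL of the undiluted suspension from replicate plate counts.

    `dilution_factor` is the fraction of the original suspension in the plated
    sample, e.g. 1e-5 for a 10^-5 dilution.
    """
    mean_colonies = sum(colony_counts) / len(colony_counts)
    plated_volume_ml = plated_volume_ul / 1000.0
    return mean_colonies / (plated_volume_ml * dilution_factor)


def ten_fold_series(stock_cfu_per_ml, steps):
    """Concentrations after successive ten-fold dilutions (assumed scheme)."""
    return [stock_cfu_per_ml / 10 ** k for k in range(1, steps + 1)]


# Hypothetical duplicate counts of 9 and 11 colonies on a 10^-5 dilution.
stock = cfu_per_ml([9, 11], plated_volume_ul=10, dilution_factor=1e-5)
series = ten_fold_series(1e8, steps=6)
```

Under these assumed counts the stock is estimated at about 10⁸ CFU/mL, and six ten-fold steps span 10⁷ down to 10² CFU/mL, matching the range of dilutions described.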
A homemade setup was developed for bacterial VOC detection, allowing VOCs to be retained as bacteria proliferate and emit them (Fig. 1). Flasks (SCHOTT) were equipped with GL45 screw caps containing two distributors and connected to screw joints (Bohlender). One distributor served as the inlet for the compressed air that carried the bacterial VOCs, and the other served as the outlet that released the bacterial VOCs to the PID. A 0.22 μm filter (Fisher) was placed in the outlet to avoid bacterial contamination. Stainless steel pipes were used in the outlet, while polytetrafluoroethylene (PTFE) tubing was used for compressed air. Valves (Swagelok) were placed in both the inlet and outlet and closed during bacterial growth. In these experiments, 100 µL of the dilutions mentioned above containing varied bacterial concentrations (10², 10³, 10⁵, 10⁷ CFU/mL) were spread into a 100 mL flask containing a layer (approx. 15 mL) of MH Agar (Biokar). The flasks were left overnight (approx. 16 h) in a static incubator (Termaks) at 37 °C. Flasks with non-inoculated MH agar medium (negative control) were prepared and subjected to the same procedure. During overnight incubation, the bacteria emitted several VOCs, which were retained inside the flask by the closed valves. After connecting each flask to the PID sensor and opening the valves, a rotameter (Aalborg) set to 200 mL/min allowed the flow of compressed air through the bacteria flask, and four MFCs (Aalborg) placed downstream of the flask were set to 50 mL/min to carry the VOCs to each of the lamp-PID chambers (as represented in Fig. 1-C). VOC flow paths for all four sensors were kept parallel, with controlled and equal flow applied to each to ensure independent performance. The VOC measurements were performed continuously for 20 min, and data were acquired with LabVIEW 2023 software.
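A short consistency check of the flow settings is sketched below. The flow rates and the 20 min run time come from the text; treating the two-second integration interval as one reading per curve is an assumption, as are the function names.

```python
ROTAMETER_ML_MIN = 200.0   # carrier flow through the flask (from the text)
MFC_ML_MIN = 50.0          # set point of each mass flow controller (from the text)
N_CHAMBERS = 4             # one chamber per lamp
RUN_MIN = 20.0             # measurement duration (from the text)
SAMPLE_INTERVAL_S = 2.0    # assumed: one reading per two-second integration


def flow_is_balanced(total_ml_min, branch_ml_min, branches, tol=1e-9):
    """True if the parallel branch flows add up to the total carrier flow."""
    return abs(total_ml_min - branch_ml_min * branches) <= tol


def swept_volume_ml(flow_ml_min, minutes):
    """Gas volume passed through one path during the run."""
    return flow_ml_min * minutes


def samples_per_curve(minutes, interval_s):
    """Number of readings in one curve at a fixed sampling interval."""
    return int(minutes * 60 // interval_s)


balanced = flow_is_balanced(ROTAMETER_ML_MIN, MFC_ML_MIN, N_CHAMBERS)
per_chamber_ml = swept_volume_ml(MFC_ML_MIN, RUN_MIN)
n_samples = samples_per_curve(RUN_MIN, SAMPLE_INTERVAL_S)
```

The four 50 mL/min branches account for the full 200 mL/min carrier flow, each chamber receives about 1 L of gas per run, and under the stated assumption each curve contains 600 readings.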
Figures 2 and 3 present data obtained from at least three independent experiments, used to calculate the mean and standard deviation (SD). In Fig. 3, a two-way ANOVA followed by Tukey’s multiple comparisons test was used to assess differences between the signals obtained from bacterial samples and the control (non-inoculated culture medium). The same analysis was also applied to compare different concentrations within each bacterial species. Statistical significance was considered when the p-value was less than 0.05.
We explored an image-based approach that represents data in a way that enables the prediction and differentiation of bacterial species. The objective was not only to achieve high accuracy but also to ensure robustness while mitigating overfitting, a challenge particularly critical in data-scarce scenarios. During the preprocessing step, each sample consisted of electrical output data generated by four PID sensors, with each curve color corresponding to a specific lamp energy level. This step improved image consistency by setting the sensor’s saturation current to 150 nA, thereby limiting any recorded value to this maximum. This choice allows us to retain useful information from curves that saturate at different rates, as some signals reach saturation rapidly while others remain below the limit throughout the experiment. Using this threshold avoids prematurely truncating the signals and helps preserve the variability of the curves for the subsequent image-based analysis. Additionally, the area under each curve was filled to enhance visual representation. Finally, the data from all four images were combined to create a single image sample. Once the electrical data were transformed into images (Figure S4), Transfer Learning59 was utilized to create robust feature representations for the image samples in this data-scarce environment. A pre-trained CNN, specifically ResNet-1853, was employed to process these image samples, effectively converting them into feature vectors (embeddings).
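The preprocessing steps above (clipping at 150 nA, filling the area under each curve, and combining the four lamp curves into one sample) can be sketched without any imaging library. This is a simplified stand-in, not the authors' pipeline: the panel layout (side by side), the integer lamp codes used in place of colors, and the tiny panel size are all assumptions. The subsequent ResNet-18 embedding step is not shown.

```python
SATURATION_NA = 150.0  # saturation current used for scaling (from the text)


def curve_to_panel(trace_na, lamp_id, height=32):
    """Rasterise one curve: each column is filled from the bottom up to the
    clipped signal level, using `lamp_id` as the pixel value (0 = background)."""
    panel = [[0] * len(trace_na) for _ in range(height)]
    for col, value in enumerate(trace_na):
        clipped = min(max(value, 0.0), SATURATION_NA)
        filled_rows = round(clipped / SATURATION_NA * height)
        for k in range(filled_rows):
            panel[height - 1 - k][col] = lamp_id
    return panel


def combine_panels(panels):
    """Concatenate equally tall panels side by side into one image."""
    image = []
    for row_index in range(len(panels[0])):
        row = []
        for panel in panels:
            row.extend(panel[row_index])
        image.append(row)
    return image


# Hypothetical four-point traces for the four lamps (values in nA).
traces = [[0.0, 75.0, 150.0, 300.0] for _ in range(4)]
panels = [curve_to_panel(t, lamp_id=i + 1, height=4) for i, t in enumerate(traces)]
image = combine_panels(panels)
```

Scaling every panel by the same 150 nA limit keeps the four curves on a common vertical axis, so a rapidly saturating signal and one that stays well below the limit remain visually distinct in the combined image.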
The classification method follows the FSL framework and applies PN60. This approach involves three main concepts: the support set, the query set, and prototypes. The support set consists of a selection of labeled examples that serve as references for the model’s predictions, with each class having its own distinct support set. The model computes the centroid of each support set to create a prototype that represents the corresponding class. Finally, PN classifies unlabeled data from the query set by matching each query sample to the nearest prototype using Euclidean distance, thereby simultaneously facilitating performance evaluation of the model44. Additional details on the implementation of the model are provided in Supplementary Note 1. Figure 8 illustrates an application of FSL for classification with PN.
Fig. 8: Pipeline of the process and overall architecture of the five-shot learning model, illustrating a metric-based setup for a multi-class classification task, where X_i represents an input tensor of the image and f_i denotes the output feature vector extracted by ResNet-18. Each support and query example is processed individually by the embedding CNN to generate its corresponding feature vector, which is then used to compute distances in the embedding space for classification.
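The prototype-based classification described above can be sketched as follows, assuming the embeddings have already been extracted. The three-dimensional vectors and the two class labels are toy values for illustration; real ResNet-18 embeddings are much higher-dimensional, and the function names are hypothetical.

```python
import math


def prototype(embeddings):
    """Centroid (element-wise mean) of a class's support embeddings."""
    return [sum(column) / len(embeddings) for column in zip(*embeddings)]


def classify(query, support_sets):
    """Assign the query to the class whose prototype is nearest in
    Euclidean distance."""
    prototypes = {label: prototype(vectors) for label, vectors in support_sets.items()}
    return min(prototypes, key=lambda label: math.dist(query, prototypes[label]))


# Toy support sets (hypothetical embeddings, two classes for brevity).
support = {
    "E. coli": [[1.0, 0.0, 0.0], [0.8, 0.2, 0.0], [1.2, 0.0, 0.2]],
    "S. aureus": [[0.0, 1.0, 1.0], [0.2, 0.8, 1.0], [0.0, 1.2, 0.8]],
}
predicted = classify([0.9, 0.1, 0.1], support)
```

Adding a new class only requires adding its support embeddings to the dictionary, which is consistent with the point that this framework does not need retraining when classes are introduced.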
The computational cost of our approach is relatively low because we do not train the model from scratch63,64 but instead use a pre-trained ResNet-18. The main operations involve a forward pass through ResNet-18 to extract image embeddings and subsequent distance calculations between prototypes in the PN. These steps are far lighter than full model training, making the method computationally efficient. All experiments were run locally on an AMD Ryzen 7 PRO 4750U CPU, demonstrating that the approach does not require high-end hardware. On average, the pipeline classifies a single query image in approximately 1.8 s (covering the forward pass, prototype computation, and classification within one episode with n_query = 1 and n_shot = 5).
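An episode with five support examples and one query can be sampled as in the sketch below. Drawing the query per class, the dataset layout (a dictionary of lists), and the integer stand-ins for embeddings are assumptions; the source only states the shot and query counts.

```python
import random


def sample_episode(dataset, n_shot=5, n_query=1, rng=None):
    """Split each class into disjoint support and query subsets for one episode."""
    rng = rng or random.Random()
    support, query = {}, {}
    for label, items in dataset.items():
        if len(items) < n_shot + n_query:
            raise ValueError(f"class {label!r} has too few samples for this episode")
        picked = rng.sample(items, n_shot + n_query)
        support[label] = picked[:n_shot]
        query[label] = picked[n_shot:]
    return support, query


# Integers stand in for image embeddings (hypothetical data).
toy_dataset = {
    "E. coli": list(range(0, 8)),
    "K. pneumoniae": list(range(100, 108)),
}
support_set, query_set = sample_episode(toy_dataset, rng=random.Random(0))
```

Repeating this sampling over many episodes and averaging the per-episode accuracy is the usual way an episodic evaluation is summarized.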
All processed data are available in the main text or the supplementary materials.
Balasubramanian, R., Van Boeckel, T. P., Carmeli, Y., Cosgrove, S. & Laxminarayan, R. Global Incidence in Hospital-Associated Infections Resistant to Antibiotics: An Analysis of Point Prevalence Surveys from 99 Countries. PLoS Med. 20, e1004178. https://doi.org/10.1371/journal.pmed.1004178 (2023).
Abban, M. K., Ayerakwa, E. A., Mosi, L. & Isawumi, A. The Burden of Hospital Acquired Infections and Antimicrobial Resistance. Heliyon 9, e20561. https://doi.org/10.1016/j.heliyon.2023.e20561 (2023).
Horan, T. C., Andrus, M. & Dudeck, M. A. CDC/NHSN Surveillance Definition of Health Care-Associated Infection and Criteria for Specific Types of Infections in the Acute Care Setting. Am. J. Infect. Control. 36, 309–332. https://doi.org/10.1016/j.ajic.2008.03.002 (2008).
Klevens, R. M. et al. Estimating Health Care-Associated Infections and Deaths in U.S. Hospitals, 2002. Public. Health Rep. 122, 160–166. https://doi.org/10.1177/003335490712200205 (2007).
European Centre for Disease Prevention and Control (ECDC). Point Prevalence Survey of Healthcare-Associated Infections and Antimicrobial Use in European Long-Term Care Facilities (2025).
Sandu, A. M. et al. Healthcare-Associated Infections: The Role of Microbial and Environmental Factors in Infection Control-A Narrative Review. 14, 933–971, (2025). https://doi.org/10.1007/s40121-025-01143-0
Pogue, J. M., Kaye, K. S., Cohen, D. A. & Marchaim, D. Appropriate Antimicrobial Therapy in the Era of Multidrug-Resistant Human Pathogens. Clin. Microbiol. Infect. 21, 302–312. https://doi.org/10.1016/J.CMI.2014.12.025 (2015).
Timbrook, T. T. et al. The Effect of Molecular Rapid Diagnostic Testing on Clinical Outcomes in Bloodstream Infections: A Systematic Review and Meta-Analysis. Clin. Infect. Dis. 64, 15–23. https://doi.org/10.1093/CID/CIW649 (2017).
Lambregts, M. M. C., Bernards, A. T., van der Beek, M. T., Visser, L. G. & de Boer, M. G. Time to Positivity of Blood Cultures Supports Early Re-Evaluation of Empiric Broad-Spectrum Antimicrobial Therapy. PLoS One. 14 https://doi.org/10.1371/JOURNAL.PONE.0208819 (2019).
Rajapaksha, P. et al. A Review of Methods for the Detection of Pathogenic Microorganisms. Analyst 144, 396–411. https://doi.org/10.1039/C8AN01488D (2019).
Váradi, L. et al. Methods for the Detection and Identification of Pathogenic Bacteria: Past, Present, and Future. Chem. Soc. Rev. 46, 4818–4832. https://doi.org/10.1039/C6CS00693K (2017).
Schulz, S. & Dickschat, J. S. Bacterial Volatiles: The Smell of Small Organisms. Nat. Prod. Rep. 24, 814–842. https://doi.org/10.1039/B507392H (2007).
Epping, R. & Koch, M. On-Site Detection of Volatile Organic Compounds (VOCs). Molecules 28, 1598. https://doi.org/10.3390/MOLECULES28041598 (2023).
Bos, L. D. J., Sterk, P. J. & Schultz, M. J. Volatile Metabolites of Pathogens: A Systematic Review. PLoS Pathog. 9, e1003311. https://doi.org/10.1371/journal.ppat.1003311 (2013).
Ratiu, I. A. et al. An Optimistic Vision of Future: Diagnosis of Bacterial Infections by Sensing Their Associated Volatile Organic Compounds. Crit. Rev. Anal. Chem. 50, 501–512. https://doi.org/10.1080/10408347.2019.1663147 (2020).
Filipiak, W. et al. Molecular Analysis of Volatile Metabolites Released Specifically by Staphylococcus Aureus and Pseudomonas Aeruginosa. BMC Microbiol. 12 https://doi.org/10.1186/1471-2180-12-113 (2012).
Filipiak, W. et al. GC-MS Profiling of Volatile Metabolites Produced by Klebsiella Pneumoniae. Front. Mol. Biosci. 9, 1019290. https://doi.org/10.3389/fmolb.2022.1019290 (2022).
Hewett, K. et al. De Lacy Costello, B. Towards the Identification of Antibiotic-Resistant Bacteria Causing Urinary Tract Infections Using Volatile Organic Compounds Analysis-A Pilot Study. Antibiotics 9, 797. https://doi.org/10.3390/antibiotics9110797 (2020).
Luo, H. et al. Rapid Identification of Carbapenemase-Producing Klebsiella Pneumoniae Using Headspace Solid-Phase Microextraction Combined with Gas Chromatography-Mass Spectrometry. Infect. Drug Resist. 16, 2601–2609. https://doi.org/10.2147/IDR.S404742 (2023).
Kunze-Szikszay, N. et al. Headspace Analyses Using Multi-Capillary Column-Ion Mobility Spectrometry Allow Rapid Pathogen Differentiation in Hospital-Acquired Pneumonia Relevant Bacteria. BMC Microbiol. 21, 69. https://doi.org/10.1186/S12866-021-02102-8/FIGURES/3 (2021).
Dias, T. et al. A Lab-Made E-Nose-MOS Device for Assessing the Bacterial Growth in a Solid Culture Medium. Biosens. (Basel). 13 https://doi.org/10.3390/bios13010019 (2023).
Reidt, U. et al. Detection of Microorganisms with an Electronic Nose for Application under Microgravity Conditions. Gravitational Space Res. 8, 1–17. https://doi.org/10.2478/gsr-2020-0001 (2020).
Reidt, U. et al. Detection of Microorganisms Onboard the International Space Station Using an Electronic Nose. Gravitational Space Res. 5, 89–111. https://doi.org/10.2478/gsr-2017-0013 (2017).
Bous, M. et al. Detection of Volatile Organic Compounds in Headspace of Klebsiella Pneumoniae and Klebsiella Oxytoca Colonies. Front. Pediatr. 11, 1151000. https://doi.org/10.3389/fped.2023.1151000 (2023).
Kunze, N. et al. Detection and Validation of Volatile Metabolic Patterns over Different Strains of Two Human Pathogenic Bacteria during Their Growth in a Complex Medium Using Multi-Capillary Column-Ion Mobility Spectrometry (MCC-IMS). Appl. Microbiol. Biotechnol. 97, 3665–3676. https://doi.org/10.1007/s00253-013-4762-8 (2013).
Zhu, J., Bean, H. D., Kuo, Y. M. & Hill, J. E. Fast Detection of Volatile Organic Compounds from Bacterial Cultures by Secondary Electrospray Ionization-Mass Spectrometry. J. Clin. Microbiol. 48, 4426–4431. https://doi.org/10.1128/JCM.00392-10/FORMAT/EPUB (2010).
Li, H. & Zhu, J. Differentiating Antibiotic-Resistant Staphylococcus Aureus Using Secondary Electrospray Ionization Tandem Mass Spectrometry. Anal. Chem. 90, 12108–12115. https://doi.org/10.1021/ACS.ANALCHEM.8B03029/ASSET/IMAGES/LARGE/AC-2018-030292_0006.JPEG (2018).
Agbroko, S. O. & Covington, J. A. Novel Low-Cost, Portable PID Sensor for the Detection of Volatile Organic Compounds. Sens. Actuators B Chem. 275, 10–15. https://doi.org/10.1016/j.snb.2018.07.173 (2018).
Rees, C. A. et al. Detection of High-Risk Carbapenem-Resistant Klebsiella Pneumoniae and Enterobacter Cloacae Isolates Using Volatile Molecular Profiles. Sci. Rep. 8, 13297. https://doi.org/10.1038/s41598-018-31543-x (2018).
Rees, C. A., Franchina, F. A., Nordick, K. V., Kim, P. J. & Hill, J. E. Expanding the Klebsiella Pneumoniae Volatile Metabolome Using Advanced Analytical Instrumentation for the Detection of Novel Metabolites. J. Appl. Microbiol. 122, 785–795. https://doi.org/10.1111/jam.13372 (2017).
Ródenas García, M. et al. Review of Low-Cost Sensors for Indoor Air Quality: Features and Applications. Appl. Spectrosc. Rev. 57, 747–779. https://doi.org/10.1080/05704928.2022.2085734 (2022).
Pang, X. et al. Low-Cost Photoionization Sensors as Detectors in GC × GC Systems Designed for Ambient VOC Measurements. Sci. Total Environ. 664, 771–779. https://doi.org/10.1016/j.scitotenv.2019.01.348 (2019).
Jian, R. S., Sung, L. Y. & Lu, C. J. Measuring Real-Time Concentration Trends of Individual VOC in an Elementary School Using a Sub-Ppb Detection ΜGC and a Single GC–MS Analysis. Chemosphere 99, 261–266. https://doi.org/10.1016/J.CHEMOSPHERE.2013.10.094 (2014).
Tait, E., Perry, J. D., Stanforth, S. P. & Dean, J. R. Identification of Volatile Organic Compounds Produced by Bacteria Using HS-SPME-GC-MS. J. Chromatogr. Sci. 52, 363–373. https://doi.org/10.1093/chromsci/bmt042 (2014).
Almasoud, N. et al. Discrimination of Bacteria Using Whole Organism Fingerprinting: The Utility of Modern Physicochemical Techniques for Bacterial Typing. Analyst 146, 770–788. https://doi.org/10.1039/D0AN01482F (2021).
Locke, A., Fitzgerald, S. & Mahadevan-Jansen, A. Advances in Optical Detection of Human-Associated Pathogenic Bacteria. Molecules 25, 5256. https://doi.org/10.3390/MOLECULES25225256 (2020).
Wang, L. et al. Applications of Raman Spectroscopy in Bacterial Infections: Principles, Advantages, and Shortcomings. Front. Microbiol. 12, 683580. https://doi.org/10.3389/FMICB.2021.683580/TEXT (2021).
Chícharo, A. et al. Precision-Engineered Plasmonic Nanostar Arrays for High-Performance SERS Sensing. Adv. Opt. Mater. 13, e01275. https://doi.org/10.1002/ADOM.202501275 (2025).
Zhang, C. et al. Volatilomics Analysis of Jasmine Tea during Multiple Rounds of Scenting Processes. Foods 12, 812. https://doi.org/10.3390/FOODS12040812/S1 (2023).
Chen, G. F., Lai, C. H. & Chen, W. H. Principal Component Analysis and Mapping to Characterize the Emission of Volatile Organic Compounds in a Typical Petrochemical Industrial Park. Aerosol Air Qual. Res. 20, 465–476. https://doi.org/10.4209/AAQR.2019.07.0365 (2020).
Arora, M. et al. Machine Learning Approaches to Identify Discriminative Signatures of Volatile Organic Compounds (VOCs) from Bacteria and Fungi Using SPME-DART-MS. Metabolites 12, 232. https://doi.org/10.3390/METABO12030232 (2022).
Zhou, B. et al. A Lightweight Convolutional Neural Network for Bacterial Identification Based on Raman Spectra. RSC Adv. 12, 26463. https://doi.org/10.1039/d2ra03722j (2022).
Liu, Y. et al. Rapid and Accurate Identification of Marine Microbes with Single-Cell Raman Spectroscopy. Analyst 145, 3297. https://doi.org/10.1039/c9an02069a (2020).
Zeng, W. & Xiao, Z. Few-Shot Learning Based on Deep Learning: A Survey. Mathematical Biosciences and Engineering 21, 679–711. https://doi.org/10.3934/MBE.2024029 (2024).
Gharoun, H., Momenifar, F., Chen, F. & Gandomi, A. H. Meta-Learning Approaches for Few-Shot Learning: A Survey of Recent Advances. ACM Comput. Surv. 56, 1–41. https://doi.org/10.1145/3659943/ASSET/A4843494-3BFB-4FA5-B394-E26841DD7583/ASSETS/IMAGES/LARGE/CSUR-2023-0062-F27.JPG (2024).
Mi, F. et al. Recent Advancements in Microfluidic Chip Biosensor Detection of Foodborne Pathogenic Bacteria: A Review. Anal. Bioanal Chem. 414, 2883. https://doi.org/10.1007/S00216-021-03872-W (2022).
Gangwar, R. et al. Plasma Functionalized Carbon Interfaces for Biosensor Application: Toward the Real-Time Detection of Escherichia Coli O157: H7. ACS Omega. 7, 21025–21034. https://doi.org/10.1021/ACSOMEGA.2C01802/ASSET/IMAGES/MEDIUM/AO2C01802_M001.GIF (2022).
Izadi, M. & Arvand, M. An Aptamer-Functionalized AuNPs/RGO Nanocomposite Biosensor for Ultrasensitive Detection of Foodborne Pathogen E. Coli O157:H7. Sci. Rep. 16, 2701, (2026). https://doi.org/10.1038/s41598-025-32516-7
Costa, S. P. et al. A Microfluidic Platform Combined with Bacteriophage Receptor Binding Proteins for Multiplex Detection of Escherichia Coli and Pseudomonas Aeruginosa in Blood. Sens. Actuators B Chem. 376, 132917. https://doi.org/10.1016/J.SNB.2022.132917 (2023).
Muller, V. et al. Identification of Pathogenic Bacteria in Complex Samples Using a Smartphone Based Fluorescence Microscope. RSC Adv. 8, 36493. https://doi.org/10.1039/c8ra06473c (2018).
Storer, M. K., Hibbard-Melles, K., Davis, B. & Scotter, J. Detection of Volatile Compounds Produced by Microbial Growth in Urine by Selected Ion Flow Tube Mass Spectrometry (SIFT-MS). J. Microbiol. Methods. 87, 111–113. https://doi.org/10.1016/j.mimet.2011.06.012 (2011).
Allardyce, R. A., Hill, A. L. & Murdoch, D. R. The Rapid Evaluation of Bacterial Growth and Antibiotic Susceptibility in Blood Cultures by Selected Ion Flow Tube Mass Spectrometry. Diagn. Microbiol. Infect. Dis. 55, 255–261. https://doi.org/10.1016/j.diagmicrobio.2006.01.031 (2006).
He, K., Zhang, X., Ren, S. & Sun, J. Deep Residual Learning for Image Recognition. Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition 770–778. https://doi.org/10.1109/CVPR.2016.90 (2016).
Miranda, A. & De Beule, P. A. A. Atmospheric Photoionization Detector with Improved Photon Efficiency: A Proof of Concept for Application of a Nanolayer Thin-Film Electrode. Sensors 21, 7738. https://doi.org/10.3390/s21227738 (2021).
Boots, A. W., Bos, L. D., van der Schee, M. P., van Schooten, F. J. & Sterk, P. J. Exhaled Molecular Fingerprinting in Diagnosis and Monitoring: Validating Volatile Promises. Trends Mol. Med. 21, 633–644. https://doi.org/10.1016/J.MOLMED.2015.08.001 (2015).
Zscheppank, C., Wiegand, H. L., Lenzen, C., Wingender, J. & Telgheder, U. Investigation of Volatile Metabolites during Growth of Escherichia Coli and Pseudomonas Aeruginosa by Needle Trap-GC-MS. Anal. Bioanal Chem. 406, 6617–6628. https://doi.org/10.1007/s00216-014-8111-2 (2014).
Goeminne, P. C. et al. Detection of Pseudomonas Aeruginosa in Sputum Headspace through Volatile Organic Compound Analysis. Respir Res. 13, 87. https://doi.org/10.1186/1465-9921-13-87 (2012).
Davis, T. J. et al. Pseudomonas Aeruginosa Volatilome Characteristics and Adaptations in Chronic Cystic Fibrosis Lung Infections. mSphere 5, e00843–20. https://doi.org/10.1128/mSphere (2020).
Neyshabur, B., Sedghi, H. & Zhang, C. What Is Being Transferred in Transfer Learning? In Proceedings of the 34th Conference on Neural Information Processing Systems (NeurIPS 2020), Vancouver, Canada; Neural Information Processing Systems Foundation, Vol. 2020-December (2020).
Snell, J., Swersky, K. & Zemel, R. S. Prototypical Networks for Few-Shot Learning. In Advances in Neural Information Processing Systems; Vol. 30 (2017).
Cheng, H., Garrick, D. J. & Fernando, R. L. Efficient Strategies for Leave-One-out Cross Validation for Genomic Best Linear Unbiased Prediction. J. Anim. Sci. Biotechnol. 8, 38. https://doi.org/10.1186/S40104-017-0164-6 (2017).
Beleites, C. & Salzer, R. Assessing and Improving the Stability of Chemometric Models in Small Sample Size Situations. Anal. Bioanal Chem. 390, 1261–1271. https://doi.org/10.1007/S00216-007-1818-6 (2008).
Kornblith, S., Shlens, J. & Le, Q. V. Do Better ImageNet Models Transfer Better? In Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition; IEEE Computer Society, Vol. 2019-June, pp. 2656–2666 (2019).
Kim, H. E. et al. Transfer Learning for Medical Image Classification: A Literature Review. BMC Med. Imaging. 22, 69. https://doi.org/10.1186/S12880-022-00793-7 (2022).
Dong, G. & Liu, H. Feature Engineering for Machine Learning and Data Analytics; CRC Press, Taylor and Francis, ISBN 1351721275 (2018).
Fitzgerald, S., Duffy, E., Holland, L. & Morrin, A. Multi-Strain Volatile Profiling of Pathogenic and Commensal Cutaneous Bacteria. Sci. Rep. 10, 17971. https://doi.org/10.1038/s41598-020-74909-w (2020).
Skvarc, M., Stubljar, D., Rogina, P. & Kaasch, A. J. Non-Culture-Based Methods to Diagnose Bloodstream Infection: Does It Work? Eur. J. Microbiol. Immunol. (Bp). 3, 97. https://doi.org/10.1556/EUJMI.3.2013.2.2 (2013).
Yagupsky, P. & Nolte, F. S. Quantitative Aspects of Septicemia. Clin. Microbiol. Rev. 3, 269–279. https://doi.org/10.1128/CMR.3.3.269 (1990).
Kranz, J. et al. European Association of Urology Guidelines on Urological Infections: Summary of the 2024 Guidelines. Eur. Urol. 86, 27–41. https://doi.org/10.1016/J.EURURO.2024.03.035 (2024).
Hay, A. D. et al. Microbiological Diagnosis of Urinary Tract Infection by NHS and Research Laboratories. Pediatrics 125, 335–341. https://doi.org/10.1542/PEDS.2008-1455 (2016).
Ferone, M., Gowen, A., Fanning, S. & Scannell, A. G. M. Microbial Detection and Identification Methods: Bench Top Assays to Omics Approaches. Compr. Rev. Food Sci. Food Saf. 19, 3106–3129. https://doi.org/10.1111/1541-4337.12618 (2020).
Guo, Y., Gao, W., Yang, H., Ma, C. & Sui, S. De-Escalation of Empiric Antibiotics in Patients with Severe Sepsis or Septic Shock: A Meta-Analysis. Heart Lung. 45, 454–459. https://doi.org/10.1016/J.HRTLNG.2016.06.001 (2016).
Smart, A. et al. Sniffing out Resistance – Rapid Identification of Urinary Tract Infection-Causing Bacteria and Their Antibiotic Susceptibility Using Volatile Metabolite Profiles. J. Pharm. Biomed. Anal. 167, 59–65. https://doi.org/10.1016/J.JPBA.2019.01.044 (2019).
Roda, B. et al. New Analytical Platform Based on Field-Flow Fractionation and Olfactory Sensor to Improve the Detection of Viable and Non-Viable Bacteria in Food. Anal. Bioanal Chem. 408, 7367–7377. https://doi.org/10.1007/S00216-016-9836-X (2016).
Palma, S. I. C. J. et al. Machine Learning for the Meta-Analyses of Microbial Pathogens’ Volatile Signatures. Sci. Rep. 8, 3360. https://doi.org/10.1038/s41598-018-21544-1 (2018).
Shauloff, N. et al. Sniffing Bacteria with a Carbon-Dot Artificial Nose. Nanomicro Lett. 13, 112. https://doi.org/10.1007/s40820-021-00610-w (2021).
Henry, J., Endres, J. L., Sadykov, M. R., Bayles, K. W. & Svechkarev, D. Fast and Accurate Identification of Pathogenic Bacteria Using Excitation-Emission Spectroscopy and Machine Learning. Sens. Diagnostics. 3, 1253–1262. https://doi.org/10.1039/d4sd00070f (2024).
Song, C., Ristenpart, T. & Shmatikov, V. Machine Learning Models That Remember Too Much. In Proceedings of the ACM Conference on Computer and Communications Security; Association for Computing Machinery, pp. 587–601 (2017).
Sibila, O. et al. Airway Bacterial Load and Inhaled Antibiotic Response in Bronchiectasis. Am. J. Respir Crit. Care Med. 200, 33–41. https://doi.org/10.1164/RCCM.201809-1651OC (2019).
Burd, E. M. & Kehl, K. S. A Critical Appraisal of the Role of the Clinical Microbiology Laboratory in the Diagnosis of Urinary Tract Infections. J. Clin. Microbiol. 49, 34–38. https://doi.org/10.1128/JCM.00788-11 (2011).
The authors acknowledge the financial support of the project SMARTgNOSTICS, with the reference n.º C644915155-00000024, co-funded by Component C5 – Capitalisation and Business Innovation under the Portuguese Resilience and Recovery Plan, through the NextGenerationEU Fund.
These authors contributed equally to this work: Susana P. Costa and António Cardoso.
International Iberian Nanotechnology Laboratory, Avenida Mestre José Veiga s/n, Braga, 4715-330, Portugal
Susana P. Costa, Hedieh Mahmoodnia, Fábio Gonçalves, Adelaide Miranda & Pieter De Beule
INESC TEC, Rua Dr. Roberto Frias, Porto, 4200-465, Portugal
António Cardoso
Faculdade de Engenharia, INESC TEC, Universidade do Porto, Rua Dr. Roberto Frias, Porto, 4200-465, Portugal
Felipe Yamada & Luís Guimarães
Faculdade de Economia, INESC TEC, Universidade do Porto, Rua Dr. Roberto Frias, Porto, 4200-464, Portugal
Flávia Barbosa
S.P.C., H.M., and F.G. designed and performed the experimental work. A.C. and F.Y. performed algorithm development; S.P.C., A.C., and F.Y. analyzed the data. S.P.C., H.M., and A.C. wrote the manuscript. All authors read, reviewed, and approved the final manuscript.
Correspondence to Flávia Barbosa or Pieter De Beule.
The authors declare no competing interests.
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Below is the link to the electronic supplementary material.
Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/.
Costa, S.P., Cardoso, A., Mahmoodnia, H. et al. Bacterial species differentiation via real-time detection of microbial volatile organic compounds using a wavelength multiplexed photoionization detector and AI image-based analysis. Sci Rep 16, 15924 (2026). https://doi.org/10.1038/s41598-026-46818-x
DOI: https://doi.org/10.1038/s41598-026-46818-x
Scientific Reports (Sci Rep)
ISSN 2045-2322 (online)
© 2026 Springer Nature Limited