AI-Powered Gas Detection & Classification System

System Overview

Gas Classification
Machine learning models trained on drift characteristics to predict the gas type across 4 classes: NoGas, Perfume, Smoke, and Mixture.
Drift Diagnosis
Temporal analysis using ΔR, |ΔR|, and EMA characteristics to detect sensor drift and sensor failures above a 15% threshold.
Leak Detection
Threshold-based logic monitoring concentration levels, duration, and rapid changes to identify dangerous leaks.
Automated Alerts
Real-time webhook integration for instant notifications when confirmed leaks or critical drift events are detected.
Important Note

Gas classification and leak detection are independent processes. The system can classify "NoGas" while still detecting a leak if concentration exceeds safety thresholds, or classify "Smoke" without detecting a leak if concentration remains low.

Enterprise Integration Architecture

Simplified System Architecture

flowchart LR
    subgraph Sensors["Sensors"]
        S["7 MQ Sensors<br/>MQ2-MQ135<br/>Raw Data Collection"]
    end
    subgraph Orchestrator["Orchestrator"]
        N8N["n8n<br/>Event Routing<br/>Data Encapsulation<br/>Logging"]
    end
    subgraph Classifier["ML Classifier"]
        Models["8 ML Algorithms<br/>Random Forest: 99.58%<br/>SVM, KNN, MLP, DT<br/>Classification & Scoring"]
        LG["LangGraph<br/>Agent Coordination<br/>Decision Logic"]
    end
    subgraph Data["Data Layer"]
        VDB["Vector DB<br/>Semantic Search<br/>Pattern Storage"]
        TS["Time-Series DB<br/>Drift Tracking<br/>Temporal Evolution"]
    end
    subgraph Analytics["AI Analytics"]
        AN["Anomaly Detection<br/>Predictive Maintenance<br/>Root Cause Analysis<br/>Adaptive Alerts"]
    end
    subgraph MCP_Layer["MCP Integration"]
        MCP["Model Context<br/>Protocol<br/>Action Generation"]
    end
    subgraph Enterprise["Enterprise Systems"]
        SAP["SAP ERP<br/>Maintenance"]
        Teams["MS Teams<br/>Personnel Interaction"]
        Ext["External<br/>Systems"]
    end

    S -->|Raw Data| N8N
    N8N -->|Route to Models| Models
    Models -->|Classifications & Scores| LG
    LG -->|Store Results| VDB
    LG -->|Store Results| TS
    VDB --> Analytics
    TS --> Analytics
    Analytics -->|Trigger Actions| MCP
    MCP --> SAP
    MCP --> Teams
    MCP --> Ext
    SAP -.->|Feedback| N8N
    Teams -.->|Feedback| N8N
    Ext -.->|Context| Analytics

    classDef sensorStyle fill:#3b82f6,stroke:#1e40af,color:#fff,stroke-width:3px
    classDef orchestrationStyle fill:#8b5cf6,stroke:#6d28d9,color:#fff,stroke-width:3px
    classDef classifierStyle fill:#10b981,stroke:#059669,color:#fff,stroke-width:3px
    classDef dataStyle fill:#f59e0b,stroke:#d97706,color:#fff,stroke-width:3px
    classDef analyticsStyle fill:#ec4899,stroke:#be185d,color:#fff,stroke-width:3px
    classDef mcpStyle fill:#ef4444,stroke:#dc2626,color:#fff,stroke-width:3px
    classDef enterpriseStyle fill:#64748b,stroke:#475569,color:#fff,stroke-width:3px

    class S sensorStyle
    class N8N orchestrationStyle
    class Models,LG classifierStyle
    class VDB,TS dataStyle
    class AN analyticsStyle
    class MCP mcpStyle
    class SAP,Teams,Ext enterpriseStyle
Optimized system architecture: Sensors → Orchestrator → ML Classification → Agent Coordination → Data Storage → Analytics → Enterprise Actions

Orchestration Layer

n8n Orchestrator
Primary event orchestrator receiving raw sensor data, routing to appropriate classifiers, managing data flow, and generating system-wide events with comprehensive logging
ML Classifier Stack
8 machine learning algorithms (Random Forest, SVM, KNN, MLP, Decision Trees, etc.) performing classification, prediction, and scoring on routed sensor data
LangGraph Agent
Post-classification agent coordination handling decision logic, result aggregation, and intelligent routing of classified data to storage and analytics systems
Model Context Protocol (MCP)
Standardized interface layer for action generation, connecting analytics results to enterprise platforms with secure bidirectional communication

Machine Learning & AI Stack

Random Forest Classifier: Ensemble learning for gas classification with 99.58% accuracy, handling multiclass prediction across 56 drift-based features
Support Vector Machines: High-dimensional pattern recognition with RBF kernel for complex decision boundaries in sensor response space
K-Nearest Neighbors: Instance-based learning for real-time classification with optimized k-value selection
Multi-Layer Perceptron: Neural network architecture for non-linear pattern detection in temporal sensor data
Decision Trees: Interpretable rule-based classification for transparent decision-making processes
Gradient Boosting Methods: Sequential ensemble learning for incremental model improvement and error correction

Data Infrastructure

Vector Database
Purpose: Semantic search and pattern storage
Storage: Classified results, sensor patterns, historical signatures
Queries: Nearest neighbor searches for anomaly detection
Source: LangGraph agent stores classification results post-ML processing
Time-Series Database
Purpose: Temporal evolution and drift tracking
Storage: Classification scores, confidence levels, drift progression
Analysis: Temporal patterns, statistical aggregations, trend detection
Source: LangGraph agent stores timestamped classification outputs
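
The nearest-neighbor query used for anomaly detection can be illustrated without any particular vector database. The sketch below is a minimal in-memory stand-in: it assumes patterns are stored as plain numeric vectors and compared with Euclidean distance, and that a reading counts as anomalous when its closest stored pattern is farther than a chosen `max_distance`. The function names, the distance metric, and the cutoff are all hypothetical; a real deployment would delegate the search to the vector database's own query API.

```python
import math


def nearest_distance(query, stored_patterns):
    """Return the Euclidean distance from `query` to its closest stored pattern."""
    return min(math.dist(query, pattern) for pattern in stored_patterns)


def is_anomalous(query, stored_patterns, max_distance):
    """Flag a reading whose nearest stored pattern is farther than `max_distance`.

    `max_distance` is an illustrative cutoff, not a value from this system.
    """
    return nearest_distance(query, stored_patterns) > max_distance


# Two toy stored patterns (3-dimensional for readability; the real
# feature vectors described in this document have 56 dimensions).
patterns = [
    [0.0, 0.0, 0.0],
    [1.0, 1.0, 1.0],
]

close_reading = [0.1, 0.0, 0.0]
novel_reading = [5.0, 5.0, 5.0]
```

Calling `is_anomalous(novel_reading, patterns, 0.5)` flags the reading, while `close_reading` falls within the cutoff of the first stored pattern.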

Enterprise System Connections

1. SAP ERP Integration
Via MCP: Maintenance work orders, asset management, inventory tracking
Data Flow: Leak events trigger automatic maintenance requests
Sync: Sensor calibration records, replacement part orders
2. Microsoft Teams Integration
Via MCP: Two-way communication with operations and safety personnel
Interaction: Real-time alerts, status queries, command acknowledgment, collaborative response
Rich Content: Embedded sensor readings, classification confidence, location data, action recommendations
3. External Data Sources
Via MCP: Weather data, facility schedules, regulatory databases
Context: Environmental factors affecting sensor performance
Compliance: Automated reporting to regulatory systems

AI-Powered Analytics

Anomaly Detection: Unsupervised learning identifying novel sensor patterns not present in training data
Predictive Maintenance: Time-series forecasting for sensor drift progression and calibration scheduling
Root Cause Analysis: LLM-powered investigation correlating alerts with operational events and environmental factors
Adaptive Thresholds: Reinforcement learning optimizing detection parameters based on operational feedback
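
As a rough illustration of the predictive maintenance idea, the sketch below fits a straight line to a sensor's drift history and estimates how long until it reaches the 15% drift threshold. Linear extrapolation is a deliberately simple stand-in chosen for this example; the forecasting method actually used is not specified here. The function names and the time unit (days) are assumptions.

```python
def fit_line(times, values):
    """Ordinary least-squares fit; returns (slope, intercept)."""
    n = len(times)
    mean_t = sum(times) / n
    mean_v = sum(values) / n
    sxx = sum((t - mean_t) ** 2 for t in times)
    sxy = sum((t - mean_t) * (v - mean_v) for t, v in zip(times, values))
    slope = sxy / sxx
    return slope, mean_v - slope * mean_t


def days_until_threshold(times, drift_pct, threshold_pct=15.0):
    """Estimate the time remaining before drift reaches `threshold_pct`.

    Returns None when the fitted trend is flat or decreasing, and 0.0 when
    the fitted line has already crossed the threshold.
    """
    slope, intercept = fit_line(times, drift_pct)
    if slope <= 0:
        return None
    crossing_time = (threshold_pct - intercept) / slope
    return max(0.0, crossing_time - times[-1])


# Toy history: drift grows by 2 percentage points per day.
days = [0, 1, 2, 3]
drift = [0.0, 2.0, 4.0, 6.0]
remaining = days_until_threshold(days, drift)
```

With this toy history the fitted line crosses 15% at day 7.5, so `remaining` is 4.5 days after the last observation. A calibration could then be scheduled inside that window.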

Performance Metrics

Accuracy: 99.58%
Precision: 99.58%
Recall: 99.58%
Value Score: 3.78M

Model Performance Comparison

Model                           | Accuracy | Precision | Recall | F1-Score | Value Score
--------------------------------|----------|-----------|--------|----------|------------
Random Forest                   | 99.58%   | 99.58%    | 99.58% | 99.58%   | 3,780,000
K-Nearest Neighbors             | 99.32%   | 99.33%    | 99.32% | 99.32%   | 3,742,500
Decision Tree                   | 99.17%   | 99.17%    | 99.17% | 99.17%   | 3,720,000
Support Vector Machine          | 98.59%   | 98.60%    | 98.59% | 98.59%   | 3,637,500
Quadratic Discriminant Analysis | 92.29%   | 92.50%    | 92.29% | 92.27%   | 2,730,000

Confusion Matrix - Random Forest

Confusion Matrix for Random Forest Model
Random Forest achieves near-perfect classification across all gas types

ROC Curves - Multiclass Classification

ROC Curves for Random Forest
One-vs-Rest ROC curves demonstrating exceptional discriminative ability

Technical Architecture

Sensor Array

MQ2: LPG, Butane, Methane, Smoke
MQ3: Smoke, Ethanol, Alcohol
MQ5: LPG, Natural Gas
MQ6: LPG, Butane
MQ7: Carbon Monoxide
MQ8: Hydrogen
MQ135: Air Quality, Smoke, Benzene

Feature Engineering

Feature Importance Analysis
Drift characteristics provide robust features for gas classification
56 Total Features: 7 sensors × 8 drift characteristics per sensor
ΔR Analysis: Difference between maximum resistance and baseline
|ΔR| Ratio: Normalized resistance ratio (max/baseline)
EMA Tracking: Exponential Moving Averages with 3 alpha values
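
One way the 8 per-sensor characteristics could be computed is sketched below. ΔR and the max/baseline ratio follow the definitions above. The split of the remaining six features into the maximum and minimum of an EMA-filtered signal for each of 3 alpha values is an assumption, as are the alpha values themselves, the EMA recurrence, and all function names.

```python
SENSORS = ["MQ2", "MQ3", "MQ5", "MQ6", "MQ7", "MQ8", "MQ135"]


def ema_transform(resistance, alpha):
    """EMA of the sample-to-sample change in resistance.

    Assumed recurrence: y[k] = (1 - alpha) * y[k-1] + alpha * (r[k] - r[k-1]),
    starting from y = 0. Requires at least two samples.
    """
    y = 0.0
    out = []
    for previous, current in zip(resistance, resistance[1:]):
        y = (1 - alpha) * y + alpha * (current - previous)
        out.append(y)
    return out


def drift_features(resistance, baseline, alphas=(0.1, 0.01, 0.001)):
    """Return 8 drift characteristics for one sensor window."""
    peak = max(resistance)
    features = {
        "delta_r": peak - baseline,  # ΔR: maximum resistance minus baseline
        "ratio": peak / baseline,    # normalized ratio: max / baseline
    }
    for alpha in alphas:  # hypothetical alpha values
        ema = ema_transform(resistance, alpha)
        features[f"ema_max_{alpha}"] = max(ema)  # rising-phase summary
        features[f"ema_min_{alpha}"] = min(ema)  # decaying-phase summary
    return features


# Toy window, reused for every sensor purely for illustration.
window = [10.0, 12.0, 15.0, 13.0, 11.0]
per_sensor = {name: drift_features(window, baseline=10.0) for name in SENSORS}

# Flatten to a single feature vector: 7 sensors x 8 characteristics = 56.
feature_vector = [value for name in SENSORS for value in per_sensor[name].values()]
```

For the toy window, `delta_r` is 5.0 and `ratio` is 1.5, and the flattened vector has the 56 entries that the classifiers expect.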

Sensor Response Distributions

MQ2 Sensor Distribution
MQ2 sensor response patterns across different gas exposures

System Workflow

Training Phase

1. Data Collection
Load raw sensor data from 7 MQ sensors with 4 gas class labels
2. Feature Extraction
Calculate drift characteristics using temporal windows (56 features total)
3. Model Training
Train 8 different ML models with hyperparameter optimization
4. Value Evaluation
Score models using: Value = VTP×TP + VTN×TN + VFP×FP + VFN×FN
5. Model Selection
Deploy best performing model (Random Forest: 99.58% accuracy)
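
The value evaluation step can be sketched as a small scoring function. Only VFN = -5000 is stated in this document; the other three weights below are placeholder values chosen for illustration. Treating one class as "positive" and the rest as "negative" to obtain TP, TN, FP, and FN from multiclass labels is also an assumption.

```python
def binary_counts(y_true, y_pred, positive):
    """Count TP, TN, FP, FN for one class treated as positive (one-vs-rest)."""
    tp = tn = fp = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if truth == positive and pred == positive:
            tp += 1
        elif truth != positive and pred != positive:
            tn += 1
        elif truth != positive and pred == positive:
            fp += 1
        else:
            fn += 1
    return tp, tn, fp, fn


def value_score(tp, tn, fp, fn, v_tp, v_tn, v_fp, v_fn):
    """Value = V_TP*TP + V_TN*TN + V_FP*FP + V_FN*FN."""
    return v_tp * tp + v_tn * tn + v_fp * fp + v_fn * fn


# V_FN comes from this document; the other weights are hypothetical.
WEIGHTS = {"v_tp": 1000, "v_tn": 100, "v_fp": -500, "v_fn": -5000}

y_true = ["Smoke", "NoGas", "Smoke", "NoGas"]
y_pred = ["Smoke", "NoGas", "NoGas", "NoGas"]

counts = binary_counts(y_true, y_pred, positive="Smoke")
score = value_score(*counts, **WEIGHTS)
```

Here a single missed "Smoke" sample outweighs every correct prediction, which is the intended effect of a large negative false-negative weight: a model that misses hazards is penalized far more than one that raises a false alarm.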

Real-Time Prediction

1. Sensor Reading
Receive raw measurements from all 7 MQ sensors
2. Drift Calculation
Compute temporal drift characteristics from sensor history
3. Gas Classification
Predict gas type with confidence probabilities
4. Drift Detection
Identify sensor drift exceeding 15% threshold
5. Leak Detection
Check concentration (>3000 ppm), duration (>30s), and rapid changes (>50%)
6. Alert Generation
Send webhook notifications for confirmed leaks or critical drift
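
The final alert step could look roughly like the sketch below, which builds a JSON payload and posts it with the Python standard library. The payload field names, the event type strings, and the receiving URL are all hypothetical; the actual webhook contract is not described here.

```python
import json
import urllib.request
from datetime import datetime, timezone


def build_alert(event_type, gas_class, confidence, details):
    """Assemble a JSON-serializable alert body (field names are hypothetical)."""
    return {
        "event_type": event_type,  # e.g. "confirmed_leak" or "critical_drift"
        "gas_class": gas_class,    # classifier output, independent of leak status
        "confidence": round(confidence, 4),
        "details": details,
        "timestamp": datetime.now(timezone.utc).isoformat(),
    }


def send_webhook(url, payload, timeout_s=5.0):
    """POST the payload as JSON and return the HTTP status code."""
    body = json.dumps(payload).encode("utf-8")
    request = urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout_s) as response:
        return response.status


# A leak can be confirmed even when the classifier reports "NoGas",
# because the two pipelines run independently.
alert = build_alert(
    "confirmed_leak",
    "NoGas",
    0.97,
    {"concentration_ppm": 3400, "duration_s": 42},
)
```

Sending is then a single call such as `send_webhook(webhook_url, alert)`, where `webhook_url` is whatever endpoint the orchestrator exposes.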

System Flow Diagrams

Modeling Flow Chart
Complete modeling and evaluation pipeline

Implementation Details

Technology Stack

Python, scikit-learn, Random Forest, KNN, SVM, MLP, FastAPI, Webhook Integration, Real-time Processing

Key Capabilities

Multi-Model Evaluation: 8 different ML algorithms tested and optimized for best performance
Value-Based Selection: Custom scoring function weighing false negatives heavily (VFN = -5000)
Temporal Analysis: Window-based drift detection using exponential moving averages
Configurable Thresholds: Adjustable concentration, duration, and change rate parameters
Independent Detection: Separate gas classification and leak detection pipelines
Real-Time Alerts: Instant webhook notifications to external systems

Detection Criteria

Concentration Threshold
Total gas concentration exceeding 3000 ppm triggers potential leak status
Duration Requirement
Leak must persist for at least 30 seconds to be confirmed as critical
Rapid Change Detection
Sudden concentration variations exceeding 50% indicate emergency conditions
Drift Monitoring
Sensor drift above 15% triggers maintenance alerts and calibration requirements
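
The four criteria above can be combined into a small stateful checker, sketched below. The thresholds (3000 ppm, 30 seconds, 50%, 15%) come from this section. Measuring rapid change relative to the previous reading, measuring drift relative to a stored baseline, and the class, method, and status names are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class LeakDetector:
    concentration_ppm: float = 3000.0  # potential leak above this level
    min_duration_s: float = 30.0       # persistence needed to confirm a leak
    rapid_change_ratio: float = 0.5    # 50% change between consecutive readings
    _above_since: Optional[float] = None
    _last_ppm: Optional[float] = None

    def update(self, timestamp_s, ppm):
        """Process one reading and return the current leak status."""
        # Rapid change: relative jump versus the previous reading (assumed).
        rapid = (
            self._last_ppm is not None
            and self._last_ppm > 0
            and abs(ppm - self._last_ppm) / self._last_ppm > self.rapid_change_ratio
        )
        self._last_ppm = ppm

        if ppm > self.concentration_ppm:
            if self._above_since is None:
                self._above_since = timestamp_s  # start of the excursion
            elapsed = timestamp_s - self._above_since
            status = "confirmed_leak" if elapsed >= self.min_duration_s else "potential_leak"
        else:
            self._above_since = None  # excursion ended; reset the timer
            status = "normal"
        return {"status": status, "rapid_change": rapid}


def drift_exceeded(baseline, current, threshold=0.15):
    """True when the reading deviates from baseline by more than 15%."""
    return abs(current - baseline) / baseline > threshold


detector = LeakDetector()
readings = [(0, 1000), (10, 3500), (25, 3600), (40, 3700), (50, 500)]
results = [detector.update(t, ppm) for t, ppm in readings]
```

In this toy sequence the leak is only confirmed at t = 40, once the concentration has stayed above 3000 ppm for 30 seconds. Because all parameters are dataclass fields, they can be adjusted per site, for example `LeakDetector(concentration_ppm=2000.0)`.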

Data Sources & Validation

Training Data

Gas Sensor Array Drift Dataset
Source: UCI Machine Learning Repository
Purpose: Drift characteristics methodology
Gases: Ethanol, Ethylene, Ammonia, Acetaldehyde, Acetone, Toluene
Application: Temporal drift analysis framework
MultimodalGasData Dataset
Source: Mendeley Data
Purpose: Real-world MQ sensor training data
Classes: NoGas, Perfume, Smoke, Mixture
Sensors: MQ2, MQ3, MQ5, MQ6, MQ7, MQ8, MQ135
Dataset Limitations

The MQ2 sensor can physically detect methane, but the training dataset does not include a dedicated "Methane" class. If methane is present, the system will classify it as "Smoke" or "Mixture" based on sensor response patterns. The leak detection system operates independently and will identify dangerous concentrations regardless of the gas type classification.

Ready to Deploy Industrial-Grade Gas Detection

Advanced machine learning delivering 99.58% accuracy with enterprise-ready integration