KEY TAKEAWAYS
- A modern fleet AI pipeline moves data through five discrete layers: CAN bus, edge device, MQTT transport, cloud ingestion, and ML inference. Each layer introduces failure points that directly affect prediction accuracy.
- CAN bus generates 2,000 to 10,000 frames per second per vehicle. Edge filtering reduces cloud payload by 80 to 95% before transmission without losing predictive signal.
- MQTT QoS level and broker architecture determine whether fault-code data survives connectivity gaps. The wrong configuration produces silent data loss that corrupts downstream models.
- Cloud ML pipelines for fleet prediction combine time-series anomaly detection with physics-based degradation models. Neither approach alone achieves production-grade fault detection accuracy.
- Intangles’ pipeline processes 450+ real-time signals per vehicle through a Digital Twin layer, achieving 95% fault detection accuracy and surfacing failures 2 to 4 weeks before any DTC triggers.
Most conversations about fleet IoT focus on the dashboard: what alerts fire, what the map looks like, and how driver scores are calculated. The architecture question rarely surfaces until something breaks: a prediction fires too late, a model stops working after a vehicle firmware update, or a new OEM’s CAN signals don’t map to the existing schema.
The pipeline is the product. How data moves from a vehicle sensor to a prediction output determines latency, accuracy, and which failure patterns a system can actually detect. This guide documents each layer of the IoT-to-AI data pipeline used in modern predictive fleet intelligence platforms, with specific engineering considerations for fleet IT managers evaluating telematics infrastructure and engineering teams integrating vehicle data into existing platforms.
The full pipeline at a glance
A fleet AI pipeline is not a single system. It is a chain of five distinct layers, each with its own engineering constraints, failure modes, and vendor choices. Understanding where each layer begins and ends is what separates a genuine evaluation from a feature-comparison exercise.
Each section below covers one layer: what it does, what can go wrong, and what to verify when evaluating a platform.
Layer 1: Vehicle sensors and CAN bus
Every commercial vehicle continuously broadcasts data across several bus architectures. The primary one for fleet telematics is the Controller Area Network (CAN bus), with SAE J1939 as the application standard for heavy vehicles and OBD-II over CAN (ISO 15765-4) for lighter applications.
J1939 identifies each message by a Parameter Group Number (PGN), a standardised identifier that defines which signals the frame carries. A single J1939 frame carries 8 bytes of data. A single 250 kbps segment saturates at roughly 1,900 such frames per second, so the 2,000 to 10,000 frames per second a vehicle generates is the aggregate across its CAN segments and all active PGNs, covering the engine ECU, transmission, braking, aftertreatment, and body systems simultaneously. Recent research applying deep learning to J1939 CAN bus data has demonstrated reliable exhaust backpressure and power loss prediction from these raw signal streams, confirming the predictive value of continuous PGN capture.
| System | Protocol | Typical signals | Update rate |
| Engine ECU | J1939 | RPM, coolant temp, oil pressure, fuel rate, boost pressure | 10 to 100 ms |
| Transmission | J1939 | Gear position, torque converter slip, fluid temp | 20 to 100 ms |
| Braking (ABS/EBS) | J1939 | Wheel speed per axle, brake demand, axle load | 10 to 20 ms |
| Aftertreatment (DEF/DPF) | J1939 | DEF level, DPF soot load, SCR efficiency, NOx | 100 to 500 ms |
| Body/auxiliary | Proprietary CAN | PTO state, refrigeration unit, tail-lift | Variable |
| IMU | Direct to ECU | 3-axis acceleration, roll, pitch | 10 to 50 ms |
| GPS | NMEA 0183/U-blox | Position, heading, altitude, HDOP | 1 to 10 Hz |
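To make the PGN idea concrete, here is a minimal sketch of how a 29-bit J1939 identifier splits into priority, PGN, and source address, and how two well-known signals decode into engineering units. The function and class names are illustrative, not part of any vendor SDK. The scaling values for engine speed (SPN 190 in PGN 61444) and coolant temperature (SPN 110 in PGN 65262) follow the public J1939 conventions and should be checked against the applicable J1939-71 release before production use.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class J1939Id:
    priority: int
    pgn: int
    source_address: int


def parse_j1939_id(can_id: int) -> J1939Id:
    """Split a 29-bit extended CAN identifier into J1939 fields."""
    priority = (can_id >> 26) & 0x7
    data_page = (can_id >> 24) & 0x3       # extended data page + data page bits
    pdu_format = (can_id >> 16) & 0xFF
    pdu_specific = (can_id >> 8) & 0xFF
    source_address = can_id & 0xFF
    if pdu_format < 240:
        # PDU1: PS is a destination address, so it is not part of the PGN.
        pgn = (data_page << 16) | (pdu_format << 8)
    else:
        # PDU2: broadcast message, PS is the group extension.
        pgn = (data_page << 16) | (pdu_format << 8) | pdu_specific
    return J1939Id(priority, pgn, source_address)


def decode_engine_speed_rpm(data: bytes) -> Optional[float]:
    """SPN 190 (PGN 61444): bytes 4-5, little-endian, 0.125 rpm per bit."""
    raw = data[3] | (data[4] << 8)
    if raw > 0xFAFF:                       # error / not-available range
        return None
    return raw * 0.125


def decode_coolant_temp_c(data: bytes) -> Optional[float]:
    """SPN 110 (PGN 65262): byte 1, 1 degC per bit, offset -40 degC."""
    raw = data[0]
    if raw > 0xFA:                         # error / not-available range
        return None
    return raw - 40.0


ident = parse_j1939_id(0x0CF00400)
print(ident.pgn, ident.priority, ident.source_address)
```

Note that a raw byte of 0 decodes to -40°C under the coolant temperature offset, which is why a decoding mismatch between edge and cloud can look like a plausible but wrong reading rather than an error.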
OBD-II vs J1939: Which one matters for your fleet?
OBD-II has been mandatory in US light-duty vehicles (under 8,500 lb GVWR) since model year 1996. It provides a standardised subset of diagnostic data via a 16-pin port and a defined PID set. J1939 is the heavy-duty equivalent used in Class 6 to 8 trucks, buses, construction equipment, and agricultural machinery. It carries far more signals, supports proprietary extensions, and is the relevant protocol for most commercial fleet applications.
A platform that reads only OBD-II cannot build an accurate predictive model for a Class 8 truck. The signals needed for engine, drivetrain, and aftertreatment prediction simply are not there. This distinction matters when evaluating vendor claims about “full diagnostic access.”
Not all PGNs are accessible across all vehicles, either. OEM proprietary PGNs, which often carry the richest diagnostic data, require OEM licensing agreements or validated reverse-engineering. Intangles’ InGenious hardware includes validated PGN mappings across major commercial truck OEMs (Volvo, Daimler, PACCAR, Navistar, Tata, Ashok Leyland), which eliminates the most common integration failure point at this layer.
Layer 2: Edge device and local processing
What does the edge device actually do?
The telematics edge device sits between the vehicle’s CAN bus and the cellular modem. In a well-architected pipeline, it is not a passive data relay. It performs four functions that directly determine the quality of what reaches the cloud.
- Signal filtering and decimation: Raw CAN traffic at thousands of frames per second cannot be transmitted economically over cellular. The edge device applies configurable filters: transmit only when a signal changes beyond a defined threshold (delta filtering), or cap transmission at a maximum rate (rate limiting). A coolant temperature that shifts 0.1°C every 100 ms does not need to transmit at 10 Hz. Well-implemented edge filtering reduces cloud payload by 80 to 95% without meaningful loss of predictive signal. Fixed-rate decimation with no adaptive logic, by contrast, discards exactly the transient spikes that indicate incipient failures.
- Local buffering for connectivity gaps: Vehicles lose cellular coverage. Without local buffering, those gaps corrupt time-series models: a 15-minute tunnel produces a 15-minute anomaly that has nothing to do with vehicle health. Edge devices should buffer at least 24 to 72 hours of filtered telemetry and sync on reconnection with timestamps intact. The distinction between UTC-stamped local storage and transmission-time sequencing matters because the latter produces ordering errors after reconnection that are nearly impossible to detect downstream.
- Edge inference: Some advanced platforms run lightweight ML models directly on the edge device. Simple threshold-based rules are the baseline: trigger an alert if coolant temperature exceeds 105°C. More capable edge inference runs statistical process control models or compressed neural networks to detect local anomalies before transmission, cutting detection latency from the minutes or hours a batched or coverage-delayed cloud round-trip can take to seconds.
- Signal encoding and compression: A JSON-encoded J1939 frame is 5 to 10 times larger than its binary equivalent (protobuf or a custom binary format). For a 500-vehicle fleet transmitting 100 signals per vehicle at 1 Hz, the difference between JSON and binary encoding is approximately 40 MB vs 6 MB per vehicle per day, which directly affects cellular data costs at scale.
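The delta filtering and rate limiting described above can be sketched in a few lines. This is an illustrative sketch, not any device's actual firmware logic: the class name, the per-signal threshold, and the heartbeat interval are all assumptions. A sample is sent when it moves past the threshold or when too much time has passed since the last transmission, so a slow drift stays quiet while a transient spike still goes out immediately.

```python
from dataclasses import dataclass, field
from typing import Dict, Tuple


@dataclass
class DeltaFilter:
    """Emit a sample when it moves past a threshold or a heartbeat expires."""
    threshold: float        # minimum change worth transmitting
    max_interval_s: float   # heartbeat: always send at least this often
    _last: Dict[str, Tuple[float, float]] = field(default_factory=dict)

    def should_send(self, signal: str, ts: float, value: float) -> bool:
        last = self._last.get(signal)
        if last is None:
            send = True     # first sample for this signal is always sent
        else:
            last_ts, last_value = last
            send = (abs(value - last_value) >= self.threshold
                    or ts - last_ts >= self.max_interval_s)
        if send:
            # Compare future samples against the last *transmitted* value,
            # so a slow drift eventually accumulates past the threshold.
            self._last[signal] = (ts, value)
        return send


coolant = DeltaFilter(threshold=1.0, max_interval_s=30.0)
samples = [(0.0, 85.0), (0.1, 85.4), (0.2, 86.5), (10.0, 86.6), (31.0, 86.6)]
sent = [s for s in samples if coolant.should_send("coolant_temp_c", *s)]
print(sent)
```

Comparing against the last transmitted value rather than the previous sample is a deliberate choice here: it keeps a signal that creeps by 0.1°C per reading from drifting indefinitely without an update.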
Edge device hardware: What the spec sheet should say
| Parameter | Basic telematics | Predictive AI |
| CPU | 32-bit ARM Cortex-M | 64-bit ARM Cortex-A |
| RAM | 256 KB | 512 MB+ |
| Local storage | 8 MB flash | 8 to 32 GB eMMC |
| CAN interfaces | 1 x J1939 | 2 x J1939 + OBD-II |
| Cellular | Cat-M1/NB-IoT | LTE Cat-4 + Cat-M1 fallback |
| Operating temp | -20°C to +70°C | -40°C to +85°C |
Layer 3: MQTT transport
Why MQTT and not HTTP?
MQTT (Message Queuing Telemetry Transport) is the dominant protocol for vehicle telematics data transport. Its publish/subscribe model means vehicles publish to topics without needing to know which cloud services consume the data, so adding a new ML model that consumes engine temperature data requires no vehicle firmware change. Its persistent session model means the broker queues QoS 1 and QoS 2 messages for any subscriber that drops offline and delivers them on reconnection. Its fixed header is as small as 2 bytes, versus 200 to 800 bytes of headers for a typical HTTP/1.1 request, which matters at high-frequency transmission rates. HiveMQ’s 2026 analysis of MQTT in fleet tracking deployments confirms the protocol’s pub/sub architecture as the primary reason telematics platforms select it over HTTP-based alternatives.
QoS levels and their fleet implications
MQTT defines three Quality of Service levels, and the choice has direct consequences for data completeness.
- QoS 0 (fire and forget): No acknowledgement is sent, so packet loss equals data loss. Appropriate for high-frequency signals like GPS position at 1 Hz, where a missed reading is immediately superseded. Not appropriate for fault codes or threshold crossings, where a single missed message means a missed alert.
- QoS 1 (at least once): The sender retransmits until the message is acknowledged. Delivery is guaranteed but may produce duplicates, which must be handled downstream via message ID or timestamp deduplication in the stream processor. Appropriate for most telematics signals since predictive models tolerate occasional duplicates but not gaps.
- QoS 2 (exactly once): Uses a four-part handshake to guarantee each message is delivered exactly once. Highest overhead. Appropriate for billing-sensitive signals (fuel consumption, mileage) and fault code triggers where duplicates would generate false alerts.
A practical configuration for most fleet pipelines: GPS, speed, and heading on QoS 0; engine temperatures, pressures, and driver behaviour events on QoS 1; fault codes and compliance-sensitive signals on QoS 2.
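One way to keep that per-signal policy explicit is a small lookup table that the publishing code consults before every send. The sketch below is an assumption-laden illustration: the signal class names and the `fleet/<vehicle>/<class>/<signal>` topic layout are hypothetical, and a real deployment would use its own taxonomy.

```python
from enum import IntEnum


class QoS(IntEnum):
    AT_MOST_ONCE = 0
    AT_LEAST_ONCE = 1
    EXACTLY_ONCE = 2


# Hypothetical signal classes mapped to the QoS tiers described above.
QOS_BY_CLASS = {
    "position": QoS.AT_MOST_ONCE,       # GPS, speed, heading
    "engine": QoS.AT_LEAST_ONCE,        # temperatures, pressures
    "driver_event": QoS.AT_LEAST_ONCE,  # driver behaviour events
    "fault_code": QoS.EXACTLY_ONCE,
    "compliance": QoS.EXACTLY_ONCE,
}


def qos_for(signal_class: str) -> QoS:
    """Unknown classes default to QoS 1 so a new signal is never silently lossy."""
    return QOS_BY_CLASS.get(signal_class, QoS.AT_LEAST_ONCE)


def topic_for(vehicle_id: str, signal_class: str, signal: str) -> str:
    """Hypothetical topic layout: fleet/<vehicle>/<class>/<signal>."""
    return f"fleet/{vehicle_id}/{signal_class}/{signal}"


print(topic_for("V-001", "fault_code", "dtc"), int(qos_for("fault_code")))
```

With a client library such as paho-mqtt, the resulting values would be passed as the `topic` and `qos` arguments of `publish`. Defaulting unknown classes to QoS 1 is a design choice, not a requirement: it trades a little bandwidth for the guarantee that a newly added signal cannot lose data unnoticed.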
Broker architecture
For fleets above 100 vehicles, a single MQTT broker instance is a reliability risk. Production platforms use clustered broker architecture (HiveMQ, EMQX, or AWS IoT Core) with active-active clustering across at least two availability zones, persistent session storage in an external database rather than broker-local storage, and TLS 1.2+ mutual authentication with per-device certificates issued at manufacturing time, not shared credentials.
Layer 4: Cloud ingestion and stream processing
When messages reach the broker, three parallel consumers typically operate in a mature fleet data pipeline.
The first is a raw data lake write: every message is written to cold storage (S3, Azure Data Lake, or GCS) in its original encoded format. This is the audit trail and the training data source for future ML models. Data is partitioned by vehicle ID and date, compressed with Snappy or Zstd, and retained for 2 to 5 years depending on regulatory requirements.
The second is a stream processor: a real-time engine (Apache Kafka + Flink, AWS Kinesis + Lambda, or Google Dataflow) that performs schema validation, unit normalisation from raw CAN values to engineering units (°C, kPa, L/h), signal enrichment, windowed aggregations, and threshold alerting, then writes the results to the time-series database for ML feature serving.
The third is a time-series database. General-purpose relational databases become bottlenecks at fleet scale. 500 vehicles transmitting 100 signals at 1 Hz generate 50,000 writes per second, sustained. Purpose-built time-series stores (InfluxDB, TimescaleDB, Apache Druid) handle this write-heavy pattern efficiently. The schema choices made here, including tag cardinality, downsampling policy, and retention windows, directly affect query performance for ML feature pipelines years later.
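Because QoS 1 delivery can produce duplicates, the stream processor usually needs a deduplication step before anything is written to the time-series database. The following is a minimal in-memory sketch, assuming each message carries a device-side timestamp in milliseconds. The class name and key layout are hypothetical, and a production stream engine would typically hold this state in its own keyed state store instead.

```python
from collections import OrderedDict
from typing import Tuple


class Deduplicator:
    """Drop repeats of (vehicle_id, signal, device_ts_ms) within a bounded window."""

    def __init__(self, max_keys: int = 100_000) -> None:
        self._seen: "OrderedDict[Tuple[str, str, int], None]" = OrderedDict()
        self._max_keys = max_keys

    def is_new(self, vehicle_id: str, signal: str, device_ts_ms: int) -> bool:
        key = (vehicle_id, signal, device_ts_ms)
        if key in self._seen:
            self._seen.move_to_end(key)    # keep recently repeated keys alive
            return False
        self._seen[key] = None
        if len(self._seen) > self._max_keys:
            self._seen.popitem(last=False)  # evict the least recently seen key
        return True


dedup = Deduplicator()
messages = [("V-001", "rpm", 1000), ("V-001", "rpm", 1000), ("V-001", "rpm", 1100)]
unique = [m for m in messages if dedup.is_new(*m)]
print(len(unique))
```

The bounded window is the trade-off to watch: a duplicate that arrives after its key has been evicted will pass through, so the window should be sized against the longest retransmission delay expected after a reconnection burst.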
Latency targets that matter
End-to-end latency from sensor event to alert delivery should be under 10 seconds in a well-architected pipeline. Platforms that batch-process hourly have a fundamentally different architecture, suited to reporting rather than real-time safety or imminent-failure alerting. The latency worth verifying during evaluation is not the marketing claim but the measured time from a simulated fault event to alert delivery under your own network conditions.
| Stage | Target latency | Impact if missed |
| Edge device to broker | 50 to 500 ms (LTE) | Low |
| Broker to stream processor | Under 100 ms | Low |
| Stream processor to TSDB | Under 500 ms | Low |
| TSDB to ML feature serving | Under 1 s | Medium |
| ML inference to alert delivery | Under 5 s | High |
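The per-stage targets in the table can be treated as a budget and checked against measurements from a simulated fault event. The sketch below uses the upper bound of each row, in milliseconds. The stage keys and helper names are illustrative assumptions.

```python
from typing import Dict, List

# Upper bounds from the latency table, in milliseconds.
STAGE_BUDGET_MS: Dict[str, int] = {
    "edge_to_broker": 500,
    "broker_to_stream_processor": 100,
    "stream_processor_to_tsdb": 500,
    "tsdb_to_feature_serving": 1_000,
    "inference_to_alert": 5_000,
}
END_TO_END_TARGET_MS = 10_000


def over_budget(measured_ms: Dict[str, int]) -> List[str]:
    """Return the stages whose measured latency exceeds the stage target."""
    return [stage for stage, limit in STAGE_BUDGET_MS.items()
            if measured_ms.get(stage, 0) > limit]


def headroom_ms(measured_ms: Dict[str, int]) -> int:
    """Milliseconds left before the end-to-end target is breached."""
    return END_TO_END_TARGET_MS - sum(measured_ms.values())


print(sum(STAGE_BUDGET_MS.values()), headroom_ms(STAGE_BUDGET_MS))
```

Even with every stage at its upper bound, the stages sum to 7.1 seconds, which leaves 2.9 seconds of headroom against the 10-second end-to-end target.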
Layer 5: ML inference and prediction output
What kinds of models run on fleet data?
Production fleet prediction does not run a single model type. The signals are heterogeneous: continuous sensor streams, discrete fault events, and operational parameters that modulate degradation rates all arrive together. Effective platforms combine two model families.
- Time-series anomaly detection: These models learn the normal operating envelope for each signal on each vehicle, using per-vehicle baselines that account for age, specification, and duty cycle rather than fleet averages. Common approaches include ARIMA and SARIMA for capturing daily duty cycle seasonality, LSTM or Transformer-based sequence models for multivariate temporal dependencies, and Isolation Forest or One-Class SVM for detecting unusual signal combinations where individual readings look normal but the combination does not. The output of anomaly detection is a continuous anomaly score, not a prediction. It feeds the prediction layer.
- Physics-based degradation models: These address the core limitation of anomaly detection alone: high false-positive rates in variable operating conditions. A coolant temperature spike during a mountain grade at full load is not an anomaly. It is expected behaviour. A physics-based model incorporates component thermal limits from OEM service data, degradation physics (Arrhenius equation for thermal degradation, Paris' law for fatigue crack growth), and operating condition normalisation as covariates. The output is a Remaining Useful Life (RUL) estimate with a confidence interval.
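To illustrate what a per-vehicle baseline and a continuous anomaly score look like, here is a deliberately simple stand-in: an exponentially weighted mean and variance with a z-score output. It is not one of the model families named above (ARIMA, LSTM, Isolation Forest) and is far weaker than any of them. The class name, smoothing factor, and warm-up length are assumptions chosen for readability.

```python
import math


class EwmaBaseline:
    """Per-vehicle, per-signal baseline using an exponentially weighted mean/variance."""

    def __init__(self, alpha: float = 0.05, warmup: int = 30) -> None:
        self.alpha = alpha      # smoothing factor: higher adapts faster
        self.warmup = warmup    # samples to observe before scoring
        self.n = 0
        self.mean = 0.0
        self.var = 0.0

    def score(self, x: float) -> float:
        """Return |z| against the baseline as it stood before seeing x."""
        if self.n == 0:
            self.mean = x
            self.n = 1
            return 0.0
        std = math.sqrt(self.var)
        z = abs(x - self.mean) / std if (self.n >= self.warmup and std > 0) else 0.0
        # Incremental exponentially weighted update of mean and variance.
        delta = x - self.mean
        self.mean += self.alpha * delta
        self.var = (1 - self.alpha) * (self.var + self.alpha * delta * delta)
        self.n += 1
        return z


baseline = EwmaBaseline()
for i in range(50):
    baseline.score(84.0 if i % 2 == 0 else 86.0)   # normal oscillation
print(baseline.score(110.0) > 5.0)
```

One instance per vehicle and signal gives the per-unit calibration the text describes. What this sketch cannot do is tell a genuine fault from a legitimate load-driven excursion, which is the gap the physics-based models are meant to close.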
How both approaches combine
Neither model family achieves production-grade accuracy alone. The effective architecture fuses both: anomaly scores from the ML layer are combined with physics model outputs through a Bayesian update or ensemble fusion layer. The result is a failure probability score calibrated to real operating conditions, not just statistical deviation from fleet averages. Intangles’ predictive health monitoring implements this fusion architecture, achieving 95% fault detection accuracy across engine, drivetrain, aftertreatment, and braking systems.
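As a worked illustration of the Bayesian update option, the sketch below treats the physics model's output as a prior failure probability and the anomaly detector as a binary test with assumed true-positive and false-positive rates. This is a simplification for illustration only, not a description of any vendor's fusion layer: real systems would work with continuous scores and calibrated likelihoods, and the rates used here are made-up numbers.

```python
def fuse_failure_probability(prior: float, anomaly_flag: bool,
                             true_positive_rate: float,
                             false_positive_rate: float) -> float:
    """Bayes update of a physics-model prior with a binary anomaly flag."""
    for name, p in (("prior", prior),
                    ("true_positive_rate", true_positive_rate),
                    ("false_positive_rate", false_positive_rate)):
        if not 0.0 < p < 1.0:
            raise ValueError(f"{name} must be strictly between 0 and 1")
    if anomaly_flag:
        like_fail, like_ok = true_positive_rate, false_positive_rate
    else:
        like_fail, like_ok = 1.0 - true_positive_rate, 1.0 - false_positive_rate
    numerator = prior * like_fail
    return numerator / (numerator + (1.0 - prior) * like_ok)


# Physics model says 5% failure risk; detector flags an anomaly.
posterior = fuse_failure_probability(0.05, True, 0.8, 0.1)
print(round(posterior, 3))
```

With these assumed rates, an anomaly flag lifts a 5% prior to roughly 30%, while the absence of a flag lowers it to about 1%. The same arithmetic shows why a detector with a high false-positive rate adds little: as the two rates converge, the posterior stays close to the prior.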
What prediction output looks like in practice
The prediction layer produces three output types.
- Failure probability score: a per-component probability of failure within a defined horizon (typically 30 days), updated every 5 to 15 minutes in production.
- RUL estimate: a point estimate and confidence interval for time-to-failure, used for parts pre-ordering and workshop scheduling.
- Actionable alert: a discrete event triggered when probability exceeds a threshold, including component, predicted failure mode, confidence level, recommended action, and estimated downtime cost if unaddressed.
Alerts are delivered via dashboard, email, SMS, or webhook to maintenance management systems.
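For teams wiring alerts into a maintenance management system, it helps to pin down the payload shape early. The sketch below shows one possible structure carrying the fields listed above, serialised to JSON for a webhook. Every field name, the threshold, and the example values are hypothetical and would need to match the actual platform's API.

```python
import json
from dataclasses import asdict, dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class MaintenanceAlert:
    vehicle_id: str
    component: str
    predicted_failure_mode: str
    failure_probability: float         # probability of failure within horizon_days
    horizon_days: int
    rul_days: float                    # point estimate of remaining useful life
    rul_ci_days: Tuple[float, float]   # confidence interval (low, high)
    recommended_action: str
    estimated_downtime_cost: float


def build_webhook_body(alert: MaintenanceAlert, threshold: float) -> Optional[str]:
    """Return a JSON body if the probability reaches the threshold, else None."""
    if alert.failure_probability < threshold:
        return None
    return json.dumps(asdict(alert), sort_keys=True)


# Illustrative values only.
example = MaintenanceAlert(
    vehicle_id="V-001", component="turbocharger",
    predicted_failure_mode="bearing wear", failure_probability=0.72,
    horizon_days=30, rul_days=14.0, rul_ci_days=(9.0, 21.0),
    recommended_action="inspect at next scheduled stop",
    estimated_downtime_cost=1800.0,
)
print(build_webhook_body(example, threshold=0.6))
```

Keeping the probability, horizon, and RUL interval in the same payload lets the receiving system decide how to prioritise, rather than relying on the alert text alone.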
The digital twin layer
A digital twin is a continuously-updated virtual model of a specific physical vehicle, not a generic model of that vehicle type. It is distinguished from a fleet average or population model by per-unit calibration: the twin’s parameters are fitted to the operational history of that specific asset.
In a fleet data pipeline, the twin is a persistent data structure that aggregates the vehicle’s current sensor state, historical signal trajectories, component degradation estimates, model calibration parameters, and contextual metadata, including specification, age, accumulated mileage, and maintenance history. Together, these enable queries that a signal-only system cannot answer.
- Can this vehicle complete a 600 km run given its current brake pad estimate and the forecast ambient temperatures on the route?
- If we delay the oil change by two weeks, what does the predicted failure probability distribution look like at that point?
- Which 10 vehicles in the fleet have the highest 30-day failure probability, ranked by expected downtime cost?
These are planning queries, not monitoring queries. They require a model of the vehicle, not just a stream of readings. The twin is updated on each inference cycle: the state estimator refreshes component degradation estimates via Kalman filter or particle filter, model re-calibration runs when residuals exceed threshold, prediction re-runs with updated state, and the updated predictions surface to the dashboard and maintenance scheduler. Intangles’ Digital Twin technology is the core layer that makes this per-vehicle planning capability possible.
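The state-estimator step in that update cycle can be illustrated with a one-dimensional Kalman filter. This is a minimal sketch under strong assumptions: a single scalar degradation state (for example, fraction of component life consumed), a linear wear model, and Gaussian noise. The names and parameters are hypothetical, and a production twin would track many coupled states.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DegradationState:
    estimate: float   # e.g. fraction of component life consumed, 0..1
    variance: float   # uncertainty of the estimate


def kalman_step(state: DegradationState, wear_rate: float, dt: float,
                process_var: float, measurement: float,
                measurement_var: float) -> DegradationState:
    """One predict/update cycle of a scalar Kalman filter."""
    # Predict: degradation advances by wear_rate * dt and uncertainty grows.
    predicted = state.estimate + wear_rate * dt
    predicted_var = state.variance + process_var
    # Update: blend the prediction with the new measurement by relative trust.
    gain = predicted_var / (predicted_var + measurement_var)
    estimate = predicted + gain * (measurement - predicted)
    variance = (1.0 - gain) * predicted_var
    return DegradationState(estimate, variance)


twin = DegradationState(estimate=0.40, variance=0.01)
twin = kalman_step(twin, wear_rate=0.001, dt=1.0, process_var=0.0001,
                   measurement=0.45, measurement_var=0.01)
print(twin)
```

The residual `measurement - predicted` is the quantity the text refers to for re-calibration: when it stays large over many cycles, the wear model no longer describes this vehicle and its parameters need refitting.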
Where pipelines fail in production
Evaluating a telematics platform on features misses the more important question: where does this system fail, and how does it recover? These are the most common production failure patterns.
- Schema drift: A vehicle firmware update changes the encoding of a J1939 PGN. The edge device sends raw values that the cloud normalisation layer misinterprets as engineering units. Engine coolant temperature reads as -40°C for an entire fleet. The failure is silent: no error is thrown, and the dashboard just shows wrong data. The fix is a versioned signal schema registry with validation at the stream processor and alerting on schema validation failure rate spikes.
- Clock skew: The vehicle’s real-time clock drifts without GPS correction during tunnels or underground operations. Timestamps on buffered data are wrong. Time-series models interpret the reconnection burst as a sudden multi-hour anomaly. GPS-disciplined timestamps, NTP correction at the edge device, and storing both device-local and server-reception timestamps prevent this from corrupting model outputs.
- Edge buffer overflow: A vehicle is offline longer than the edge buffer retention window, which is common in remote mining or agriculture operations. Buffer overflow discards the oldest data. The gap appears in the time-series record and corrupts rolling-window features for ML models. Sizing buffers for worst-case offline duration and flagging vehicles in a “data gap” state, rather than treating the reconnection point as continuous data, prevents this from generating false anomaly signals.
- Population model generalisation: A model trained on fleet-wide data makes predictions that are accurate on average but wrong for edge cases: old vehicles, non-standard specifications, unusual duty cycles. Segmenting training populations by vehicle specification, age, and application, combined with per-vehicle fine-tuning, is the only reliable mitigation. Monitor per-vehicle prediction residuals to detect generalisation failures before they cause missed or false alerts.
- Alert fatigue: A model with 70% precision at 100 alerts per month produces 30 false alerts. Technicians learn to ignore the system. True positives are missed. Tuning alert thresholds for precision over recall in initial deployment, then increasing recall as operator trust develops, prevents this. Track alert-to-confirmed-fault conversion rate as a first-class KPI.
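The schema drift mitigation above, a versioned signal registry with validation and failure-rate tracking, can be sketched as follows. The registry contents, signal name, and version numbers are hypothetical. The two encodings shown (1°C per bit with a -40°C offset, and 0.03125°C per bit with a -273°C offset) follow common J1939 temperature conventions but are used here only as an example of an encoding change.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Tuple


@dataclass(frozen=True)
class SignalSchema:
    scale: float
    offset: float
    valid_min: float
    valid_max: float


# Hypothetical registry keyed by (signal name, schema version).
REGISTRY: Dict[Tuple[str, int], SignalSchema] = {
    ("coolant_temp_c", 1): SignalSchema(1.0, -40.0, -40.0, 210.0),
    ("coolant_temp_c", 2): SignalSchema(0.03125, -273.0, -40.0, 210.0),
}


class SchemaValidator:
    """Normalise raw values and count validation failures for alerting."""

    def __init__(self) -> None:
        self.total = 0
        self.failed = 0

    def normalise(self, signal: str, version: int, raw: int) -> Optional[float]:
        self.total += 1
        schema = REGISTRY.get((signal, version))
        if schema is None:              # unknown version: reject, never guess
            self.failed += 1
            return None
        value = raw * schema.scale + schema.offset
        if not schema.valid_min <= value <= schema.valid_max:
            self.failed += 1
            return None
        return value

    @property
    def failure_rate(self) -> float:
        return self.failed / self.total if self.total else 0.0


validator = SchemaValidator()
print(validator.normalise("coolant_temp_c", 1, 125),
      validator.normalise("coolant_temp_c", 2, 11456))
```

A range check alone would not catch the -40°C example, because that value sits inside the valid range. What catches it is the schema version travelling with each message, so a firmware update that changes the encoding arrives as a new or unknown version and shows up as a spike in `failure_rate` rather than as silently wrong data.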
Integration checklist for fleet IT
Use this checklist when evaluating a fleet IoT-AI platform or integrating an existing platform into your data infrastructure.
Data ingestion
| Check | What to verify |
| J1939 PGN coverage | Confirmed coverage for your specific OEM/model combinations, not generic J1939 support |
| Edge device operating temp | Rated for your deployment environment (-40°C to +85°C for most commercial vehicle applications) |
| Buffer capacity | Sized for worst-case offline duration, minimum 72 hours recommended |
| Encoding protocol | Binary (protobuf or equivalent) with documented SDK for custom integration |
Transport
| Check | What to verify |
| Broker architecture | Clustered, multi-AZ, external session storage (not broker-local) |
| Authentication | Per-device TLS certificates, not shared credentials |
| QoS configuration | Per-signal QoS levels matching data criticality |
| Cellular fallback | LTE to Cat-M1 to satellite path tested in your operating region |
Cloud ingestion
| Check | What to verify |
| Raw data retention | Policy meets your regulatory and audit requirements |
| Alert latency | End-to-end from sensor event to alert delivery, target under 10 seconds |
| TSDB schema | Reviewed for your reporting and ML query patterns |
| Downsampling policy | Documented retention windows for raw, hourly, and daily resolution |
ML pipeline
| Check | What to verify |
| Model architecture | Hybrid (anomaly detection + physics) vs rule-based only |
| Per-vehicle calibration | Per-vehicle model vs population model only |
| Accuracy benchmarks | Precision and recall on your vehicle population, not vendor benchmark fleet |
| False positive rate | At the production alert threshold, not the theoretical optimum |
Integration
| Check | What to verify |
| CMMS integration | Webhook or API for alert delivery to your maintenance management system |
| Data export | Format and frequency for your data warehouse or BI platform |
| Access control | RBAC model matches your IT security requirements |
| Security certification | SOC 2 Type II or equivalent |
The value of fleet IoT is not in collecting more data. It is in building a reliable pipeline that transforms raw vehicle signals into accurate, actionable predictions. The entire pipeline, from CAN bus acquisition and edge processing to cloud ingestion, machine learning, and digital twins, determines whether raw telemetry becomes reliable predictive intelligence.
As connected fleets continue to generate larger volumes of operational data, the challenge is no longer visibility. The real opportunity lies in transforming that data into early warnings, maintenance recommendations, and actionable insights that help prevent failures before they impact operations.
Intangles addresses this challenge through an AI-powered Digital Twin platform that continuously analyses vehicle behaviour, tracks component health, and identifies emerging issues weeks before conventional diagnostics or fault codes appear.
Explore how Intangles transforms vehicle telemetry into predictive maintenance intelligence that helps fleets reduce unplanned downtime, improve reliability, and make more informed maintenance decisions.
Frequently Asked Questions
What is the difference between J1939 and OBD-II for fleet telematics?
J1939 is the heavy-duty vehicle standard used in Class 6 to 8 trucks, buses, and off-highway equipment. It carries significantly more diagnostic signals than OBD-II, which covers light-duty vehicles under 8,500 lb GVWR. A platform that reads only OBD-II cannot build accurate predictive models for commercial truck fleets. Most mixed fleets require support for both protocols.
How much data does a vehicle generate per day in a typical fleet deployment?
After edge filtering, a well-configured telematics unit transmits 5 to 10 MB per vehicle per day over cellular. Raw CAN bus data before filtering is 500 MB to 2 GB per vehicle per day. The difference is achieved through delta filtering, rate limiting, and binary encoding at the edge device, and it directly affects cellular data costs at fleet scale.
What MQTT QoS level should fleet telematics use?
QoS 1 (at least once) is appropriate for most sensor signals. Predictive models handle occasional duplicates, and deduplication is managed in the stream processor. QoS 2 (exactly once) is appropriate for fault codes and billing-sensitive signals. QoS 0 is acceptable for high-frequency GPS updates where each reading supersedes the previous one.
Can fleet ML models work without historical data for a new vehicle?
Cold-start is a real problem. Physics-based models can generate predictions from day one using OEM component specifications and operating condition data. Pure ML models typically need 4 to 12 weeks of operational data before predictions become reliable. Hybrid pipelines reduce the cold-start period to approximately 2 to 4 weeks by initialising the ML component with physics-model priors.
What does a Digital Twin add over a standard telematics dashboard?
A dashboard shows the current state. A digital twin maintains a persistent, calibrated model of each vehicle that supports predictive and planning queries: estimated time to failure for each component, what-if scenarios for maintenance scheduling, and fleet-wide risk ranking by predicted downtime cost. The twin is updated on each inference cycle and retains the full operational history in a model form, not just raw signal storage.
How does edge computing reduce cellular data costs in fleet telematics?
Edge filtering reduces raw CAN output by 80 to 95% before it reaches the cellular modem. For a 500-vehicle fleet, this represents approximately $15,000 to $40,000 per year in cellular data savings at standard commercial rates, depending on signal configuration and coverage area.