Abstract
Sensors installed on a ship return high-quality data that can be used for ship bunker fuel efficiency analysis. However, important information about the weather and sea conditions the ship sails through, such as waves, sea currents, and sea water temperature, is often absent from sensor data. This study addresses this issue by fusing sensor data with publicly accessible meteorological data, constructing nine datasets accordingly, and experimenting with widely adopted machine learning (ML) models to quantify the relationship between a ship's fuel consumption rate (ton/day or ton/h) and its voyage-based factors (sailing speed, draft, trim, weather conditions, and sea conditions). The best dataset found reveals the benefits of fusing sensor data and meteorological data for quantifying the ship fuel consumption rate. The best ML models found, consistent with our previous studies, are extremely randomized trees (ET), gradient tree boosting (GB), and XGBoost (XG). Given the best dataset from data fusion, their R² values over the training set are 0.999 or 1.000, and their R² values over the test set are all above 0.966. Their fit errors are below 0.75 ton/day in terms of RMSE and below 0.52 ton/day in terms of MAE. These promising results are well beyond the requirements of most industry applications for ship fuel efficiency analysis. The applicability of the selected datasets and ML models is also verified with a rolling horizon approach, leading to the conjecture that a rolling horizon strategy of "5-month training + 1-month test/application" could work well in practice and that sensor data covering less than five months could be insufficient to train ML models.
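As an illustration of the rolling horizon idea mentioned above, the following minimal sketch shows one way a "5-month training + 1-month test" window could slide over a sequence of months, together with plain definitions of the root mean squared error, mean absolute error, and coefficient of determination. It is not the study's implementation: the function names, the month labels, and the one-month step size are assumptions made here for illustration only.

```python
from math import sqrt


def rolling_horizon_splits(months, train_len=5, test_len=1):
    """Yield (train_months, test_months) pairs, sliding forward one month per step.

    The one-month step is an assumption; the abstract only fixes the window sizes.
    """
    window = train_len + test_len
    for start in range(len(months) - window + 1):
        train = months[start:start + train_len]
        test = months[start + train_len:start + window]
        yield train, test


def rmse(y_true, y_pred):
    """Root mean squared error, in the unit of the target (e.g. ton/day)."""
    return sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def mae(y_true, y_pred):
    """Mean absolute error, in the unit of the target (e.g. ton/day)."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def r2(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Hypothetical month labels; real data would be sensor records grouped by month.
months = ["2020-01", "2020-02", "2020-03", "2020-04",
          "2020-05", "2020-06", "2020-07", "2020-08"]
splits = list(rolling_horizon_splits(months))

for train, test in splits:
    print("train:", train[0], "to", train[-1], "| test:", test[0])
```

In practice, each split would be used to fit a model on the training months and to score its predictions on the following month with the three metrics above; the model-fitting step is omitted here to keep the sketch free of third-party dependencies.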