Weather forecasting is a natural application of machine learning, and ML.NET lets us build prediction models without leaving the .NET ecosystem. In this guide, we'll walk through creating a model that predicts the next day's temperature from current conditions (temperature, humidity, pressure, wind speed, visibility, and season), and then look at how the approach could be extended to other parameters such as humidity and pressure.
What is ML.NET?
ML.NET is Microsoft's open-source machine learning framework designed specifically for .NET developers. It allows you to build, train, and deploy machine learning models using C# or F#, making it accessible to developers already familiar with the .NET ecosystem.
Setting Up the Project
First, let's create a new console application and install the necessary NuGet packages:
dotnet new console -n WeatherPredictionML
cd WeatherPredictionML
dotnet add package Microsoft.ML
dotnet add package Microsoft.ML.FastTree
Data Model Definition
We'll start by defining our data models. Create a new file called WeatherData.cs:
using Microsoft.ML.Data;

public class WeatherData
{
    [LoadColumn(0)]
    public float Temperature { get; set; }

    [LoadColumn(1)]
    public float Humidity { get; set; }

    [LoadColumn(2)]
    public float Pressure { get; set; }

    [LoadColumn(3)]
    public float WindSpeed { get; set; }

    [LoadColumn(4)]
    public float Visibility { get; set; }

    [LoadColumn(5)]
    public string Season { get; set; }

    [LoadColumn(6)]
    [ColumnName("Label")]
    public float NextDayTemperature { get; set; }
}

public class WeatherPrediction
{
    [ColumnName("Score")]
    public float PredictedTemperature { get; set; }
}
Sample Data Generation
For this example, we'll generate sample weather data. In a real-world scenario, you would load this from a CSV file or database:
using Microsoft.ML;
using System;
using System.Collections.Generic;
using System.Linq;
public static class DataGenerator
{
    public static IEnumerable<WeatherData> GenerateSampleData(int count = 1000)
    {
        var random = new Random(42);
        var seasons = new[] { "Spring", "Summer", "Fall", "Winter" };

        for (int i = 0; i < count; i++)
        {
            var season = seasons[i % 4];
            var baseTemp = season switch
            {
                "Spring" => 15f,
                "Summer" => 25f,
                "Fall" => 10f,
                "Winter" => 0f,
                _ => 15f
            };

            var temperature = baseTemp + (float)(random.NextDouble() * 20 - 10);
            var humidity = (float)(random.NextDouble() * 100);
            var pressure = 1000f + (float)(random.NextDouble() * 50);
            var windSpeed = (float)(random.NextDouble() * 30);
            var visibility = 5f + (float)(random.NextDouble() * 15);

            // Simulate next day temperature with some correlation
            var nextDayTemp = temperature + (float)(random.NextDouble() * 6 - 3);

            yield return new WeatherData
            {
                Temperature = temperature,
                Humidity = humidity,
                Pressure = pressure,
                WindSpeed = windSpeed,
                Visibility = visibility,
                Season = season,
                NextDayTemperature = nextDayTemp
            };
        }
    }
}
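If the data lives in a CSV file instead, the LoadColumn indices on WeatherData map directly onto the file's columns. A minimal sketch, assuming a hypothetical file named weather.csv with a header row and the seven columns in the order declared on WeatherData:

```csharp
// Standalone Program.cs using top-level statements; WeatherData is the
// class defined earlier in this article.
using System;
using System.Linq;
using Microsoft.ML;

var mlContext = new MLContext(seed: 0);

// "weather.csv" is a hypothetical path. The column order in the file must
// match the [LoadColumn] indices declared on WeatherData.
IDataView dataView = mlContext.Data.LoadFromTextFile<WeatherData>(
    path: "weather.csv",
    hasHeader: true,
    separatorChar: ',');

// Peek at the first few rows to confirm the columns were parsed as expected.
var preview = dataView.Preview(maxRows: 5);
foreach (var row in preview.RowView)
{
    Console.WriteLine(string.Join(", ", row.Values.Select(kv => $"{kv.Key}={kv.Value}")));
}
```

The resulting IDataView can be passed to TrainTestSplit in the same way as the in-memory data.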
Building the Prediction Model
Now, let's create the main program that builds and trains our weather prediction model:
using Microsoft.ML;
using Microsoft.ML.Data;
using System;
using System.Linq;
class Program
{
    static void Main(string[] args)
    {
        // Create ML.NET context
        var mlContext = new MLContext(seed: 0);

        // Generate sample data
        var data = DataGenerator.GenerateSampleData(1000).ToArray();
        var dataView = mlContext.Data.LoadFromEnumerable(data);

        // Split data into training and test sets
        var splitData = mlContext.Data.TrainTestSplit(dataView, testFraction: 0.2);

        // Define the training pipeline.
        // Season is a categorical value with four possible levels, so it is
        // one-hot encoded rather than treated as free text.
        var pipeline = mlContext.Transforms.Categorical.OneHotEncoding("SeasonFeaturized", nameof(WeatherData.Season))
            .Append(mlContext.Transforms.Concatenate("Features",
                nameof(WeatherData.Temperature),
                nameof(WeatherData.Humidity),
                nameof(WeatherData.Pressure),
                nameof(WeatherData.WindSpeed),
                nameof(WeatherData.Visibility),
                "SeasonFeaturized"))
            .Append(mlContext.Regression.Trainers.FastTree());

        // Train the model
        Console.WriteLine("Training the model...");
        var model = pipeline.Fit(splitData.TrainSet);

        // Evaluate the model on the held-out test set
        Console.WriteLine("Evaluating the model...");
        var predictions = model.Transform(splitData.TestSet);
        var metrics = mlContext.Regression.Evaluate(predictions);
        Console.WriteLine($"R-Squared: {metrics.RSquared:F4}");
        Console.WriteLine($"Root Mean Squared Error: {metrics.RootMeanSquaredError:F4}");
        Console.WriteLine($"Mean Absolute Error: {metrics.MeanAbsoluteError:F4}");

        // Create prediction engine
        var predictionEngine = mlContext.Model.CreatePredictionEngine<WeatherData, WeatherPrediction>(model);

        // Make sample predictions
        MakeSamplePredictions(predictionEngine);

        // Save the model together with the input schema
        mlContext.Model.Save(model, dataView.Schema, "weather-prediction-model.zip");
        Console.WriteLine("Model saved as weather-prediction-model.zip");
    }

    static void MakeSamplePredictions(PredictionEngine<WeatherData, WeatherPrediction> predictionEngine)
    {
        Console.WriteLine("\nSample Predictions:");
        Console.WriteLine("==================");

        var sampleWeather = new[]
        {
            new WeatherData
            {
                Temperature = 20f,
                Humidity = 65f,
                Pressure = 1015f,
                WindSpeed = 10f,
                Visibility = 12f,
                Season = "Spring"
            },
            new WeatherData
            {
                Temperature = 30f,
                Humidity = 80f,
                Pressure = 1008f,
                WindSpeed = 5f,
                Visibility = 8f,
                Season = "Summer"
            },
            new WeatherData
            {
                Temperature = -5f,
                Humidity = 45f,
                Pressure = 1025f,
                WindSpeed = 15f,
                Visibility = 15f,
                Season = "Winter"
            }
        };

        foreach (var weather in sampleWeather)
        {
            var prediction = predictionEngine.Predict(weather);
            Console.WriteLine($"Current: {weather.Temperature:F1}°C, Season: {weather.Season}");
            Console.WriteLine($"Predicted next day: {prediction.PredictedTemperature:F1}°C");
            Console.WriteLine();
        }
    }
}
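The saved file can later be loaded in another process without retraining. A minimal sketch, assuming weather-prediction-model.zip sits in the working directory and that the project references the same Microsoft.ML and Microsoft.ML.FastTree packages used for training (the input values are made up):

```csharp
// Standalone Program.cs using top-level statements; WeatherData and
// WeatherPrediction are the classes defined earlier in this article.
using System;
using Microsoft.ML;

var mlContext = new MLContext();

// Load the fitted pipeline; the schema of the training input is
// returned through the out parameter.
ITransformer loadedModel = mlContext.Model.Load(
    "weather-prediction-model.zip", out var inputSchema);

// PredictionEngine is not thread-safe, so use one instance per thread.
var engine = mlContext.Model.CreatePredictionEngine<WeatherData, WeatherPrediction>(loadedModel);

// Illustrative input values only.
var today = new WeatherData
{
    Temperature = 18f,
    Humidity = 55f,
    Pressure = 1012f,
    WindSpeed = 8f,
    Visibility = 14f,
    Season = "Fall"
};

var result = engine.Predict(today);
Console.WriteLine($"Predicted next day: {result.PredictedTemperature:F1}°C");
```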
Understanding the Model Performance
The model evaluation provides several important metrics:
- R-Squared: Indicates how well the model explains the variance in the data (closer to 1 is better)
- Root Mean Squared Error (RMSE): Square root of the average squared prediction error, in the same units as the target variable (°C here); large errors are penalized more heavily than in MAE
- Mean Absolute Error (MAE): Average absolute difference between predicted and actual values
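These three metrics can be reproduced by hand from their standard definitions, which is a useful sanity check on the numbers the evaluator prints. A small standalone sketch on made-up values (the class and method names are illustrative and not part of ML.NET):

```csharp
using System;
using System.Globalization;
using System.Linq;

public static class RegressionMetricsDemo
{
    // Square root of the mean squared error.
    public static double Rmse(double[] actual, double[] predicted) =>
        Math.Sqrt(actual.Zip(predicted, (a, p) => (a - p) * (a - p)).Average());

    // Mean of the absolute errors.
    public static double Mae(double[] actual, double[] predicted) =>
        actual.Zip(predicted, (a, p) => Math.Abs(a - p)).Average();

    // 1 - (residual sum of squares / total sum of squares).
    public static double RSquared(double[] actual, double[] predicted)
    {
        double mean = actual.Average();
        double ssRes = actual.Zip(predicted, (a, p) => (a - p) * (a - p)).Sum();
        double ssTot = actual.Sum(a => (a - mean) * (a - mean));
        return 1.0 - ssRes / ssTot;
    }

    public static void Main()
    {
        // Made-up temperatures in °C.
        double[] actual = { 10, 12, 14, 16 };
        double[] predicted = { 11, 12, 13, 18 };

        var inv = CultureInfo.InvariantCulture;
        Console.WriteLine("RMSE: " + Rmse(actual, predicted).ToString("F4", inv));          // 1.2247
        Console.WriteLine("MAE: " + Mae(actual, predicted).ToString("F4", inv));            // 1.0000
        Console.WriteLine("R-Squared: " + RSquared(actual, predicted).ToString("F4", inv)); // 0.7000
    }
}
```

The single 2°C miss contributes 4 of the 6 units of squared error, which is why RMSE (about 1.22) comes out higher than MAE (1.0) on the same predictions.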
Advanced Features and Improvements
To improve your weather prediction model, consider these enhancements:
1. Feature Engineering
// Add time-based features. DayOfYear is declared as float because
// Concatenate requires all feature columns to share the same type.
public class EnhancedWeatherData : WeatherData
{
    [LoadColumn(7)]
    public float DayOfYear { get; set; }

    [LoadColumn(8)]
    public float PreviousDayTemperature { get; set; }

    [LoadColumn(9)]
    public float TemperatureTrend { get; set; }
}
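These extra columns can be derived from a date-ordered series of daily readings. A minimal sketch, assuming a hypothetical DailyObservation record and defining the trend as today's temperature minus yesterday's (the article does not fix a definition, so that choice is an assumption):

```csharp
using System;
using System.Collections.Generic;

// Hypothetical raw reading; the name and shape are illustrative.
public record DailyObservation(DateTime Date, float Temperature);

public static class FeatureBuilder
{
    // Yields one feature tuple per day, starting from the second day,
    // because the first day has no previous reading to compare against.
    public static IEnumerable<(float DayOfYear, float PreviousDayTemperature, float TemperatureTrend)>
        Build(IReadOnlyList<DailyObservation> days)
    {
        for (int i = 1; i < days.Count; i++)
        {
            float previous = days[i - 1].Temperature;
            float trend = days[i].Temperature - previous; // assumed definition of "trend"
            yield return ((float)days[i].Date.DayOfYear, previous, trend);
        }
    }
}
```

For readings of 2.0, 3.5, and 1.0 on January 1 to 3, this yields (2, 2.0, 1.5) and (3, 3.5, -2.5); the values can then be copied into the matching EnhancedWeatherData properties.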
2. Hyperparameter Tuning
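// Use AutoML for automatic model selection and hyperparameter tuning.
// Requires the Microsoft.ML.AutoML NuGet package and `using Microsoft.ML.AutoML;`.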
var experiment = mlContext.Auto().CreateRegressionExperiment(maxExperimentTimeInSeconds: 60);
var experimentResult = experiment.Execute(splitData.TrainSet, "Label");
var bestRun = experimentResult.BestRun;
3. Multiple Output Predictions
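You can extend the approach to predict several weather parameters. Each ML.NET regression trainer produces a single score, so this typically means training one model per target and collecting the results in a class such as: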
public class MultiWeatherPrediction
{
    [ColumnName("TemperatureScore")]
    public float PredictedTemperature { get; set; }

    [ColumnName("HumidityScore")]
    public float PredictedHumidity { get; set; }

    [ColumnName("PressureScore")]
    public float PredictedPressure { get; set; }
}
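One way to produce several outputs is to train a separate regression model for each target column. A minimal sketch, assuming a hypothetical MultiTargetWeatherData row type that carries next-day values for each parameter (these columns are not part of the WeatherData class above) and the Microsoft.ML.FastTree package:

```csharp
using System.Collections.Generic;
using Microsoft.ML;

// Hypothetical row type with one column per next-day target.
public class MultiTargetWeatherData
{
    public float Temperature { get; set; }
    public float Humidity { get; set; }
    public float Pressure { get; set; }
    public float NextDayTemperature { get; set; }
    public float NextDayHumidity { get; set; }
    public float NextDayPressure { get; set; }
}

public static class MultiTargetTrainer
{
    // Fits one FastTree regression model per target and returns them
    // keyed by the name of the target column.
    public static Dictionary<string, ITransformer> TrainPerTarget(
        MLContext mlContext, IDataView trainData)
    {
        string[] targets =
        {
            nameof(MultiTargetWeatherData.NextDayTemperature),
            nameof(MultiTargetWeatherData.NextDayHumidity),
            nameof(MultiTargetWeatherData.NextDayPressure)
        };

        var models = new Dictionary<string, ITransformer>();
        foreach (var target in targets)
        {
            // Same features for every model; only the label column changes.
            var pipeline = mlContext.Transforms.Concatenate("Features",
                    nameof(MultiTargetWeatherData.Temperature),
                    nameof(MultiTargetWeatherData.Humidity),
                    nameof(MultiTargetWeatherData.Pressure))
                .Append(mlContext.Regression.Trainers.FastTree(labelColumnName: target));

            models[target] = pipeline.Fit(trainData);
        }
        return models;
    }
}
```

Each fitted model writes its prediction to its own Score column, so gathering the three values into a single MultiWeatherPrediction object is left to the calling code.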
Real-World Integration
To use this model in a production environment:
- Data Pipeline: Set up automated data collection from weather APIs
- Model Retraining: Implement scheduled retraining with new data
- API Endpoint: Create a REST API to serve predictions
- Monitoring: Track model performance and data drift
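The API endpoint step can be sketched with an ASP.NET Core minimal API and the PredictionEnginePool from the Microsoft.Extensions.ML package. This assumes a web project (dotnet new web) that references Microsoft.ML, Microsoft.ML.FastTree, and Microsoft.Extensions.ML, contains the WeatherData and WeatherPrediction classes from earlier, and has the saved model file in its content root; the model name and route are arbitrary choices for illustration:

```csharp
// Program.cs of an ASP.NET Core web project. WebApplication and related
// types come from the web SDK's implicit usings.
using Microsoft.Extensions.ML;

var builder = WebApplication.CreateBuilder(args);

// Register a pool of prediction engines, since a single PredictionEngine
// is not safe to share across concurrent requests.
// "WeatherModel" is an arbitrary name chosen for this sketch.
builder.Services.AddPredictionEnginePool<WeatherData, WeatherPrediction>()
    .FromFile(modelName: "WeatherModel",
              filePath: "weather-prediction-model.zip",
              watchForChanges: true);

var app = builder.Build();

// POST a JSON body with the WeatherData fields to get a prediction back.
app.MapPost("/predict",
    (PredictionEnginePool<WeatherData, WeatherPrediction> pool, WeatherData input) =>
        pool.Predict(modelName: "WeatherModel", example: input));

app.Run();
```

With watchForChanges enabled, the pool reloads the model when the file is replaced, which fits a scheduled retraining job that overwrites the .zip.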
Conclusion
ML.NET provides an accessible platform for building weather prediction models within the .NET ecosystem. This example demonstrates the core concepts of data preparation, model training, evaluation, and prediction on synthetic data. With real historical weather data and additional feature engineering, the same pipeline can serve as the starting point for more accurate forecasting models.
The combination of ML.NET's ease of use and the robustness of .NET makes it an excellent choice for production machine learning applications. Whether you're building a simple temperature predictor or a complex multi-parameter weather forecasting system, ML.NET provides the tools you need to succeed.
Happy coding and may your predictions be accurate! 🌤️