A neural network is normally associated with Deep Learning problems such as image classification or Natural Language Processing, but it can just as easily be trained to build a Linear Regression model. In fact, building a Linear Regression model is a good starting point for learning how to build neural networks.
Here I'll be giving a demo of using TensorFlow for building a Linear Regression model. I have used the Boston House prices dataset available on Kaggle. The full code can be found here.
This article is focused on the coding aspect of building a Neural Network. It assumes a basic understanding of Neural Network architecture.
Data Extraction and Exploration
I loaded the dataset into a pandas dataframe. The dataset consists of 12 numerical features, one boolean feature ('CHAS'), and the target variable 'MEDV', which gives the median value of owner-occupied homes in Boston in $1000s. So we need to build a model that, given the above features, is able to predict the price of a house.
Here are the data preprocessing steps I used. I won't go into the details, as I want to focus on the Neural Network part.
# Data Extraction
import pandas as pd

column_names = ['CRIM', 'ZN', 'INDUS', 'CHAS', 'NOX', 'RM', 'AGE', 'DIS',
                'RAD', 'TAX', 'PTRATIO', 'B', 'LSTAT', 'MEDV']
df = pd.read_csv('/kaggle/input/boston-house-prices/housing.csv',
                 names=column_names, header=None, delim_whitespace=True)

# Data Exploration
df['CRIM'].plot.hist(bins=20);
df['ZN'].plot.hist();
df['AGE'].plot.hist();
df['MEDV'].plot.hist();
df['MEDV'].mean()
df['RAD'].unique()
df['CHAS'].unique()
Data Transformation
I used the pandas get_dummies method to one-hot encode the 'CHAS' column, which holds boolean 0/1 values indicating whether the property borders the river. Since the feature matrix should include the encoded column, X and y are defined after this step:
df = pd.get_dummies(df, columns=['CHAS'], drop_first=True)
X = df.drop('MEDV', axis=1)
y = df['MEDV']
Then comes the usual division of the data into training and test sets, followed by a further division of the training data into training and validation sets.
from sklearn.model_selection import train_test_split
X_train, X_test, y_train, y_test = train_test_split(X,y,test_size=0.2)
X_train, X_valid, y_train, y_valid = train_test_split(X_train, y_train, test_size=0.1)
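As a quick sanity check (this snippet is my addition, not in the original post), we can print the resulting shapes; with the 506-row Boston dataset the splits come out to roughly 363 training, 41 validation, and 102 test rows:
# Confirm the split sizes before moving on
print(X_train.shape, X_valid.shape, X_test.shape)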
For a neural network, standardizing or scaling the numerical data greatly improves model performance. I used sklearn's StandardScaler to scale the column values:
from sklearn.preprocessing import StandardScaler
from sklearn.compose import make_column_transformer
ct = make_column_transformer(
    (StandardScaler(), ['CRIM', 'ZN', 'INDUS', 'NOX', 'RM', 'AGE', 'DIS',
                        'RAD', 'TAX', 'PTRATIO', 'B', 'LSTAT']),
    remainder='passthrough'
)

ct.fit(X_train)
X_train = ct.transform(X_train)
X_test = ct.transform(X_test)
X_valid = ct.transform(X_valid)
Note that I have used sklearn's make_column_transformer method, which makes applying scaling (or any other transformation) to multiple columns a breeze. Also note that the transformer was first fitted on the training data only and then used to transform all three sets, with the results assigned back to the variables. This is important to avoid data leakage.
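To make the data-leakage point concrete, here is a minimal schematic (my addition; the variable names are illustrative and this is not part of the pipeline above):
# Wrong: fitting on the full feature matrix lets test-set statistics
# (mean and variance) leak into how the training data is scaled
scaler_leaky = StandardScaler().fit(X)

# Right (schematic): fit on the training split only, then reuse that
# fitted transformer for train, validation and test, as ct does above
scaler_safe = StandardScaler().fit(X_train)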
Building a TensorFlow Sequential Model
So now we are ready to build our first Neural Network using TensorFlow. It takes only a few lines of code:
import tensorflow as tf

model = tf.keras.Sequential([
    tf.keras.layers.Dense(10, activation='relu'),
    tf.keras.layers.Dense(10, activation='relu'),
    tf.keras.layers.Dense(1, activation='linear')
])
I have used three layers in this model: two hidden layers and the output layer. The hidden layers have 10 neurons each and use the 'relu' activation. The reason I used 'relu' in the hidden layers instead of the default 'linear' is that it lets the hidden layers capture non-linear relationships in the data.
The output layer has only one output neuron as we are trying to predict a single variable’s value. The activation for this layer is ‘linear’.
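If you want to inspect the layer shapes and parameter counts before training (my addition, not in the original post), you can build the model with an explicit input shape and print a summary:
# Build with the number of input features (13 after encoding) to
# materialise the weights, then print the architecture
model.build(input_shape=(None, X_train.shape[1]))
model.summary()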
Now we will compile and fit our model:
model.compile(loss=tf.keras.losses.mean_absolute_error,
              optimizer=tf.keras.optimizers.Adam(learning_rate=0.01),
              metrics=['mae'])

history = model.fit(X_train, y_train, epochs=100,
                    validation_data=(X_valid, y_valid))
During compilation, we define the loss, optimizer and metric to be used for our Linear Regression model.
Once ready, we fit the model to our training data, specify the number of epochs for which we want the model to run over the training data, and also specify the validation data.
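As an aside (this refinement is my addition, not in the original post), the validation data we pass can also drive early stopping, halting training once the validation loss stops improving:
# Hypothetical refinement: stop once val_loss plateaus and keep the best weights
early_stop = tf.keras.callbacks.EarlyStopping(monitor='val_loss',
                                              patience=10,
                                              restore_best_weights=True)
history = model.fit(X_train, y_train, epochs=100,
                    validation_data=(X_valid, y_valid),
                    callbacks=[early_stop])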
Once we run the model, Keras prints the loss and metric for each epoch. The final training 'mae' is about 4.19 and the validation 'mae' is about 4.66, which is not bad for a target variable whose mean value is about 22.5.
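To put a number on generalisation (my addition, not shown in the original post), we can also evaluate the fitted model on the held-out test set:
# Returns [loss, mae]; since the loss is also mae the two values coincide
test_loss, test_mae = model.evaluate(X_test, y_test, verbose=0)
print(f'Test MAE: {test_mae:.2f}')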
We could use this as our base model and play around with some of the hyperparameters, such as the number of hidden layers, the number of neurons in the hidden layers, the type of activations, the learning rate, and the optimization algorithm, to try to get a lower error. This is standard practice in Deep Learning.
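For example, a hypothetical variant with wider layers and a smaller learning rate might look like this (illustrative values, not tuned results):
# A sketch of one hyperparameter experiment: 32 neurons per hidden layer
# and a 10x smaller learning rate
model_2 = tf.keras.Sequential([
    tf.keras.layers.Dense(32, activation='relu'),
    tf.keras.layers.Dense(32, activation='relu'),
    tf.keras.layers.Dense(1, activation='linear')
])
model_2.compile(loss=tf.keras.losses.mean_absolute_error,
                optimizer=tf.keras.optimizers.Adam(learning_rate=0.001),
                metrics=['mae'])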
But for our demo purposes this is enough, as we have shown how to build and fit a neural network for Linear Regression using TensorFlow.
We can easily obtain a plot of our results using the ‘history’ variable:
import matplotlib.pyplot as plt

pd.DataFrame(history.history).plot(figsize=(6,5))
plt.xlabel("Epochs")
plt.title('Loss curves')
plt.legend();
which gives us a visualisation of how the training and validation loss ('mae') change per epoch.
We can also plot the predicted values alongside the test values to see how well our model has performed. First we generate predictions on the test set, then plot both:
y_pred = model.predict(X_test).flatten()

plt.figure()
plt.scatter(range(len(y_test)), y_test, color='red', label='Actual Values')
plt.scatter(range(len(y_pred)), y_pred, color='blue', label='Predicted Values')
plt.title('Actual vs Predicted')
plt.legend();
We can see that the model has been able to capture the general distribution of the price variable quite well.
Thanks for reading. Stay tuned for more TensorFlow articles in the near future.