The aggregation of all the losses/errors in a neural network during one pass of training; it measures the aggregate "correctness," or error, of the network. There are many ways to calculate a cost function; a simple one is the average of the squares of the errors across the different outputs, called the mean squared error (MSE). As we train a model, each training cycle adjusts its parameters to lower the network's total error (total cost). Reaching that point of lowest error is called convergence, and reducing how long it takes to get there is a central goal of training. See Loss Function.
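As a concrete illustration, here is a minimal sketch of computing MSE over a batch of outputs. The function and array names (`mean_squared_error`, `y_true`, `y_pred`) are placeholders for this example, not from the original entry.

```python
import numpy as np

def mean_squared_error(y_true: np.ndarray, y_pred: np.ndarray) -> float:
    """Average of the squared errors between targets and predictions."""
    errors = y_true - y_pred             # per-output error
    return float(np.mean(errors ** 2))   # square each error, then average

# The cost shrinks as predictions get closer to the targets:
y_true = np.array([1.0, 0.0, 1.0])
print(mean_squared_error(y_true, np.array([0.5, 0.5, 0.5])))  # 0.25
print(mean_squared_error(y_true, np.array([0.9, 0.1, 0.9])))  # ~0.01
```

Squaring the errors keeps positive and negative errors from canceling each other out and penalizes large errors more heavily than small ones.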