A weight-averaging approach to speeding up model training on resource-constrained devices