Speeding Up Laravel's Database Seeders

When you have to run your database seeders frequently, even a short wait for each run can quickly add up. This is a short article on how you can easily speed up your seeders and get back to coding sooner!

I was recently working on a Laravel project for which I needed to create a large number of fake users. To achieve this I used a seeder along with the excellent Faker package to generate each user's information. Here is the basic seeder class that I created:

<?php

use Illuminate\Database\Seeder;
use Faker\Factory as Faker;
use App\Users\User;

class UsersTableSeeder extends Seeder {

    public function run()
    {
        $faker = Faker::create();

        // Create 100 users, each with a fake name and email
        // and the same known password
        foreach (range(1, 100) as $index)
        {
            User::create([
                'email'      => $faker->email,
                'password'   => bcrypt('secret'),
                'name'       => $faker->firstName . ' ' . $faker->lastName
            ]);
        }
    }
}

When I came to run the seeder on my local Homestead server, it took a little over 15 seconds to finish! That may not sound too bad, but alongside other seeders, and run repeatedly throughout the day, it quickly becomes frustrating. Luckily, there are a couple of quick fixes that dramatically reduce the running time.
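If you want to take similar measurements yourself, one simple approach is to run the seeder class on its own through artisan and wrap the command in the shell's time builtin:

time php artisan db:seed --class=UsersTableSeeder

The --class option lets you run a single seeder rather than everything registered in DatabaseSeeder.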

As you can see above, every user is given the password 'secret', which we hash using Laravel's bcrypt() function. The bcrypt hashing function is very secure, and one of the reasons for this is the amount of time it takes to compute a hash. Given that each user's password is exactly the same, we can calculate the hash once, store it in a variable, and reuse it for every user. This should result in a significant reduction in computation time.

public function run()
{
    $faker = Faker::create();

    // Hash the shared password once, outside of the loop
    $password = bcrypt('secret');

    foreach (range(1, 100) as $index)
    {
        User::create([
            'email'      => $faker->email,
            'password'   => $password,
            'name'       => $faker->firstName . ' ' . $faker->lastName
        ]);
    }
}

Once we have made this change, the average time to run the seeder drops from 15 seconds to 6 seconds! Whilst this is much better, it's still far too long.

In the seeder above we are calling User::create() and passing in each fake user's information. This runs a query along the lines of INSERT INTO users (email, password, name, ...) VALUES (...) every time a user is created. In this example we want to create 100 users, which means the query runs 100 times; this is far from optimal.

SQL allows us to insert multiple rows into a table with a single query of the form: INSERT INTO table_name (Column1, Column2) VALUES (Value1, Value2), (Value1, Value2). If we can structure our seeder to run just this one query instead of 100, we should see a significant speed boost.
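To make this concrete, a multi-row insert for our users table might look something like the following (the emails, hashes, and timestamps here are just placeholder values):

INSERT INTO users (email, password, name, created_at, updated_at)
VALUES
    ('jane@example.com', '$2y$10$...', 'Jane Doe', '2015-06-01 12:00:00', '2015-06-01 12:00:00'),
    ('john@example.com', '$2y$10$...', 'John Doe', '2015-06-01 12:00:00', '2015-06-01 12:00:00');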

Rather than calling User::create() on each iteration, we will store the information for each user that we wish to create in an array. Once we have an array containing all of the fake users' data, we can call User::insert($users), which builds one big query that inserts every user into the users table at once. The example code is below:

public function run()
{
    $faker = Faker::create();
    $users = [];

    // Hash the shared password once, outside of the loop
    $password = bcrypt('secret');

    foreach (range(1, 100) as $index)
    {
        $timestamp = Carbon\Carbon::now();

        $users[] = [
            'email'         => $faker->email,
            'password'      => $password,
            'name'          => $faker->firstName . ' ' . $faker->lastName,
            // insert() bypasses Eloquent, so we have to set
            // the timestamps ourselves
            'created_at'    => $timestamp,
            'updated_at'    => $timestamp
        ];
    }

    // Insert all 100 users with a single query
    User::insert($users);
}

This modification brings our seed time down to a far more manageable 1.6 seconds, over 9 times quicker than where we started! There are, however, a couple of quirks to take note of when using this method. The first is that timestamps are no longer automatically set for us: you can see in the final example that we have had to set the created_at and updated_at timestamps ourselves, a small price to pay for such a large speed gain. Secondly, some database servers limit how many rows a single INSERT can contain (SQL Server, for example, caps it at 1,000 rows, and MySQL limits the total packet size). If this becomes an issue, you can easily split the users array up into chunks and pass each chunk to insert(), as sketched below.
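Here is a minimal sketch of that chunking approach, using PHP's array_chunk() function; the batch size of 500 is an arbitrary choice:

public function run()
{
    // ... build the $users array exactly as before ...

    // Insert the users in batches of 500 rows to stay
    // below any per-query row limits
    foreach (array_chunk($users, 500) as $chunk)
    {
        User::insert($chunk);
    }
}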

If anyone has any other suggestions or questions about speeding up database seeders, then please let me know in the comments below.
