The data generation tool in Visual Studio 2008 Data Edition is a great tool for populating your database with meaningless information to use in your unit tests, but when it comes time to do integration testing it's often important to have your data generation plan recreate a consistent dataset in key tables (like the lookup tables used in foreign keys which are often mirrored as Enums in your C# or VB.Net solutions). Fortunately, the data generation tool includes the sequential data-bound generator. This generator selects records from the specified data source and uses the results to populate your table.
So how do we make use of this? In our database solutions, we include two databases - the actual database we're working on, and a datageneration database. For the tables we need to consistenly populate, we duplicate the schema & table in the datageneration database (minus any indexes/keys/constraints/triggers, etc.), and then use the post-deployment script for that database to create the desired records. To reduce duplication of the populate scripts, the post-deployment script for the real database points to the datagenration populate script by a relative path. This also means that these tables will have the same records in them whether you have just deployed the database, or just run the data generation plan - which makes life easier for everyone on the team.
OK - enough talk - let's show this in action. Here is my solution, with three projects (datageneration database, production database, and unit tests).
And here you can see that I've duplicated some of the schemas from my production database in the datageneration database.
Here are my post-deployment and populate scripts for the datageneration database.
And of course, I want that data in my database when I actually deploy, so in the production database project we reference that populate script with a relative path.
Now, how do we hook this up in our data generation plan? It's easy - First, create a data connection to the datageneration database in your server explorer. In our office, we have a local instance of SQL server on each developer workstation, so we make sure to use either . or (local) for the server name so that the datageneration plan works on all workstations.
Now, in the datageneration plan, select the table you want to fill, change the column or columns you need filled to the sequential databound generator, and set the connection and select query properties (be sure not to reference the database in the select query - just the schema and table - the connection specifies the database). The only thing to watch for is the number of rows to generate - don't select more than you've put in your populate script.