Using Bogus to create Hierarchical Data Structures

I was recently asked to create an API for retrieving multitenant data. The actual implementation requires a LOT more work than I have time to complete, so I thought I would create an API with some faked data.

Now, at first, I was tempted to grab some data from one tenant and serialize it into a fake service that essentially hijacks the process, returning pseudo-random data. THEN I remembered working with a class library called Bogus. This is a cool library that "generates" random data of all sorts using a fluent interface. You configure the rules using lambdas and then call methods to generate the classes.

This is very straightforward for simple POCOs and works great. However, I was trying to create an object graph with hierarchical data: objects at the second layer need a valid reference to objects at the first layer, third to second, and fourth to third. That way, you can query down the chain, or fetch the objects one level down that reference a given parent.

The classes for this hierarchy are pretty straightforward - not a lot to them. They essentially have a TenantId, Id, Name and DisplayName along with (for the lower levels) another property that references the parent. For this exercise, let's assume the hierarchy is spatial, and we want to model Regions, Sites, Buildings and Floors. This could be any hierarchical model, but this one should be easy enough to visualize.

Since Region is the top of the hierarchy, it won't have a parent ID, but the rest of the objects will. Here's what the Region class may look like:

public class Region
{
    public Guid TenantId { get; set; }
    public string Id { get; set; }
    public string Name { get; set; }
    public string DisplayName { get; set; }
}

And setting up the Bogus rules for the Region object is pretty straightforward:

private Faker<Region> CreateRandomizerForRegions(IList<Guid> tenants)
{
    var randoRegions = new Faker<Region>()
        // assign each region to one of the supplied tenants
        .RuleFor(t => t.TenantId, f => f.PickRandom(tenants))
        // first 10 characters of an upper-cased GUID with the hyphens removed
        .RuleFor(i => i.Id, f => Guid.NewGuid().ToString().ToUpper().Replace("-", "")[..10])
        .RuleFor(n => n.Name, f => f.Commerce.Product())
        .RuleFor(d => d.DisplayName, f => f.Company.CompanyName());
    return randoRegions;
}

For the Id property, I wanted something a little less than a GUID, but something still somewhat random. This could have just as easily been a GUID, but I thought I'd mix it up a little. I realize that it's not truly unique like a GUID should be.
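The same ID shape can also be produced without the string replacement by asking for the GUID's "N" format, which leaves out the hyphens. This is only a sketch: ShortId is a hypothetical helper name, not part of Bogus or the original code.

```csharp
using System;

// Hypothetical helper; not part of Bogus or the original project.
public static class ShortId
{
    // Returns the first `length` hex characters of a new GUID, upper-cased.
    // The "N" format specifier yields 32 hex digits with no hyphens.
    public static string New(int length = 10)
    {
        if (length < 1 || length > 32)
            throw new ArgumentOutOfRangeException(nameof(length));

        return Guid.NewGuid().ToString("N")[..length].ToUpperInvariant();
    }
}
```

Ten hex characters carry 40 bits, so collisions are unlikely for a few hundred generated objects but, as noted above, not impossible.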

Anyway, in order to create this Faker object, I pass in a collection of GUIDs that represent several tenants to create graphs for.

So, the rules break down like this:

  • The rule for the TenantId property simply chooses a random tenant from the passed-in list
  • The rule for the Id property does that wacky thing with GUIDs mentioned above
  • The rule for the Name property chooses a random Product from the Commerce collection (things like Bananas, Pants, etc.)
  • The rule for the DisplayName property chooses a random CompanyName from the Company collection

Once configured, this method returns the Faker object with the rules for generating properties, specific to the Region object.

This one is very straightforward, so not much more to say here. To see more examples like this, check out the GitHub repository linked above.

The next level down, the Site object, will have an additional property for RegionId so that we know to what region the Site belongs. Since we can't really set the properties in a vacuum (because we need valid Tenant IDs and IDs from the parent Region objects) we need another mechanism for hydrating the TenantId and RegionId properties with valid values created when generating random Region objects.

Fortunately, Bogus has a CustomInstantiator method that will allow us to do just that, but we need a constructor that will take a collection of those randomized Region objects.

public class Site
{
    public Site(IList<Region> associatedRegions)
    {
        // project each region down to its TenantId/Id pair
        var uniqueCombos = associatedRegions
            .Select(r => new LinkingObject { TenantId = r.TenantId, Id = r.Id });
        // choose one combo randomly
        var faker = new Faker();
        var randomItem = faker.PickRandom(uniqueCombos);

        // both keys come from the same region, so they always agree
        this.TenantId = randomItem.TenantId;
        this.RegionId = randomItem.Id;
    }

    public Guid TenantId { get; set; }
    public string RegionId { get; set; }
    public string Id { get; set; }
    public string Name { get; set; }
    public string DisplayName { get; set; }
}

In addition to the same properties as the Region object, Site includes a RegionId and a custom constructor that takes the randomized Region objects created in the previous step. The constructor first projects those regions into a collection of TenantId/Id pairs (the LinkingObject type is just a small holder for those two values). Then it picks a random pair from that collection and assigns its values to the TenantId and RegionId of the Site object. Because both values come from the same Region, a Site can never point at a region that belongs to a different tenant.
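The LinkingObject type used in that constructor isn't shown in this post. Judging only by how it is used (an object initializer that sets TenantId and Id), a minimal version might look like the following; the exact shape is an assumption.

```csharp
using System;

// Assumed shape, inferred from usage in the Site constructor;
// the original definition is not shown in the post.
public class LinkingObject
{
    // Tenant that owns the parent object.
    public Guid TenantId { get; set; }

    // Identifier of the parent object (for a Site, the Region's Id).
    public string Id { get; set; } = string.Empty;
}
```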

Now that we have that set up, we can write the randomizing rules for the Site object:

private Faker<Site> CreateRandomizerForSites(IList<Region> regions)
{
    var randoSites = new Faker<Site>()
        // build each Site through the constructor that picks a parent region
        .CustomInstantiator(f => new Site(regions))
        .RuleFor(s => s.Id, f => Guid.NewGuid().ToString().ToUpper().Replace("-", "")[..10])
        .RuleFor(s => s.Name, f => f.Commerce.Product())
        .RuleFor(s => s.DisplayName, f => f.Company.CompanyName());

    return randoSites;
}

Here, we pass those randomized Region objects to the method that creates the randomization rules for the Site object. CustomInstantiator tells Bogus to build each Site through that constructor, so TenantId and RegionId are already populated before the remaining rules run.
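One side effect of doing the pick inside the Site constructor is that each Site creates its own Faker, so the parent choice would not follow a local seed set with UseSeed on the Faker<Site>. A possible variation, offered as a sketch and not the approach used in this post, picks the parent inside CustomInstantiator with the Faker that Bogus hands to the lambda. RegionModel, SiteModel and SiteFakerFactory are hypothetical names, and the sketch assumes the class has a parameterless constructor.

```csharp
using System;
using System.Collections.Generic;
using Bogus;

// Hypothetical stand-ins for the Region and Site classes in this post.
public class RegionModel
{
    public Guid TenantId { get; set; }
    public string Id { get; set; } = "";
}

public class SiteModel
{
    public Guid TenantId { get; set; }
    public string RegionId { get; set; } = "";
    public string Id { get; set; } = "";
    public string Name { get; set; } = "";
}

public static class SiteFakerFactory
{
    public static Faker<SiteModel> Create(IList<RegionModel> regions)
    {
        return new Faker<SiteModel>()
            .CustomInstantiator(f =>
            {
                // One pick supplies both keys, so TenantId and RegionId always agree.
                var parent = f.PickRandom(regions);
                return new SiteModel { TenantId = parent.TenantId, RegionId = parent.Id };
            })
            // f.Random.Guid() draws from the Faker's own randomizer.
            .RuleFor(s => s.Id, f => f.Random.Guid().ToString("N")[..10].ToUpperInvariant())
            .RuleFor(s => s.Name, f => f.Commerce.Product());
    }
}
```

The trade-off is that the linking logic moves out of the model class and into the Faker setup, which also leaves the model free to keep a parameterless constructor.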

This same pattern gets applied to randomizing the Building and Floor objects as well. At a slightly higher level, you might end up with something like this (where the properties can be queried later):

public Randomizer(IList<Guid> tenants)
{
    // each level is generated from the level above it, top down
    var regions = this.CreateRandomizerForRegions(tenants);
    this.FakeRegions = regions.Generate(this.NumRegions).ToList();
    var sites = this.CreateRandomizerForSites(this.FakeRegions);
    this.FakeSites = sites.Generate(this.NumSites).ToList();
    var buildings = this.CreateRandomizerForBuildings(this.FakeSites);
    this.FakeBuildings = buildings.Generate(this.NumBuildings).ToList();
    var floors = this.CreateRandomizerForFloors(this.FakeBuildings);
    this.FakeFloors = floors.Generate(this.NumFloors).ToList();
}

We create the randomized Region objects, assign those to the Randomizer class property of FakeRegions, and then use those FakeRegions to randomize and associate the Site objects. Then we assign the Site objects to the property and pass those to the Building, and then the same with Building to Floor. The randomizing rules for Building and Floor are very similar to the Site object, and they have a very similar constructor.
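Since every child copies its keys from a generated parent, a quick sanity check can confirm that no orphaned references slipped in. The helper below is a sketch with a hypothetical name (HierarchyCheck); it is not part of Bogus or the original project, and it works on any parent/child pair by taking key selectors.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper for validating generated data.
public static class HierarchyCheck
{
    // Returns the children whose (tenant, parent id) pair matches no parent.
    public static List<TChild> FindOrphans<TParent, TChild>(
        IEnumerable<TParent> parents,
        IEnumerable<TChild> children,
        Func<TParent, (Guid, string)> parentKey,
        Func<TChild, (Guid, string)> childParentKey)
    {
        // Value tuples compare by value, so they work directly as set keys.
        var known = new HashSet<(Guid, string)>(parents.Select(parentKey));
        return children.Where(c => !known.Contains(childParentKey(c))).ToList();
    }
}
```

With the classes from this post, a call might look like HierarchyCheck.FindOrphans(FakeRegions, FakeSites, r => (r.TenantId, r.Id), s => (s.TenantId, s.RegionId)), and an empty result means every Site points at a real Region in the same tenant.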

The Num{x} properties (NumRegions, NumSites, etc.) allow for creating a variable number of each object. In this case, they default to 5, 10, 60 and 500 respectively, but they can be overridden with any positive value. The ToList call guarantees that each collection is materialized once and never re-enumerated. Recent versions of Bogus already return a List<T> from Generate(count), so here it is mostly a safeguard; it matters more with lazily evaluated sequences such as the one returned by GenerateLazy.
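The effect of enumerating a deferred sequence more than once can be shown with plain LINQ, no Bogus required. LazyDemo is a hypothetical name used only for this sketch.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class LazyDemo
{
    // Builds a deferred sequence of new IDs: nothing runs until it is enumerated,
    // and every enumeration runs the generator again.
    public static IEnumerable<string> LazyIds(int count) =>
        Enumerable.Range(0, count).Select(_ => Guid.NewGuid().ToString("N"));
}
```

Calling First() twice on LazyDemo.LazyIds(3) yields two different IDs, because each call restarts the sequence. After LazyIds(3).ToList(), the values are fixed, which is what child objects need if they are going to reference their parents by ID.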

From here, these objects could be serialized into a file which is, in turn, read in and deserialized, or they could just as easily be persisted in a database using Entity Framework. For my purposes, I'm just storing them as JSON files, and reading them back in as the constituent objects for use in other code. Not particularly elegant, but it gets the job done.
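As a rough idea of what the JSON round trip could look like, here is a sketch using System.Text.Json. The JsonStore name, the one-file-per-collection layout and the indented output are all assumptions, not details from the original project.

```csharp
using System.Collections.Generic;
using System.IO;
using System.Text.Json;

// Hypothetical helper; names and file layout are assumptions.
public static class JsonStore
{
    private static readonly JsonSerializerOptions Options =
        new JsonSerializerOptions { WriteIndented = true };

    // Writes the whole collection to a single JSON file.
    public static void Save<T>(string path, IReadOnlyList<T> items) =>
        File.WriteAllText(path, JsonSerializer.Serialize(items, Options));

    // Reads the collection back; a JSON null becomes an empty list.
    public static List<T> Load<T>(string path) =>
        JsonSerializer.Deserialize<List<T>>(File.ReadAllText(path), Options)
            ?? new List<T>();
}
```

One thing to watch: a class like Site, whose only constructor takes the list of parent regions, would likely need an additional parameterless constructor (or a separate DTO) before a serializer can rebuild it from JSON.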

I hope you found this useful!
