Sunday, August 2, 2009

More on Fluent Interface Pattern for Composing Entity Framework Queries

Recap of Last Week

In last week's post I went over an Entity Framework pattern that I’ve been using that’s saved me quite a bit of time in my Repository classes.  The basic idea is that most Entity Framework queries can be broken down into 3 pieces as shown in the diagram below.

image The interesting part emerges when we realize that an EF query is just an object of type ObjectQuery, and that we can create methods that take an ObjectQuery as a parameter, do their processing, and then pass back a return value of type ObjectQuery. This means that we can chain the methods together, fluent interface style, to create queries that look like this:

List<BlogPost> postSet = GetBlogPostSet().SortBy(“Date”).GetPage(pageIndex, pageSize);

List<BlogPost> postSet = GetBlogPostSet().SortBy(“ID”).GetPage(pageIndex, pageSize);

List<BlogPost> postSet = GetBlogPostSet().SortBy(“ID”).GetAll();

Shaping Data With Projection Methods

This week I want to take a closer look at the Projection Method and how we can use it to shape our return data.  So let’s start out by looking at a pretty standard query that returns a list of BlogPost entities.

List<BlogPost> posts = GetAllBlogPosts().SortBy(sortOption).GetList();

This query contains each of the three components described above, a Filter method (GetAllBlogPosts), a Sort method (SortBy), and a Projection method (GetList). The code for each of the methods is listed below.  Note that our Projection Method, GetList(), is just running toList() on the query and returning the resulting List<BlogPost>.

// GetBlogPostSetByDate

private ObjectQuery<BlogPost> GetAllBlogPosts()

{

    var query = from p in Context.BlogPostSet.Include("Categories")

                select p;

    return (ObjectQuery<BlogPost>)query;

}

// SortBy

internal static ObjectQuery<BlogPost> SortBy( this ObjectQuery<BlogPost> query, Enums.BlogPostSortOption sortOption)

{

    switch (sortOption)

    {

        case Enums.BlogPostSortOption.ByDate:

            return (ObjectQuery<BlogPost>)query.OrderByDescending(p => p.PostedUtc);

        case Enums.BlogPostSortOption.BySite:

            return (ObjectQuery<BlogPost>)query.OrderBy(p => p.BlogProfile.BlogName);

        case Enums.BlogPostSortOption.ByVote:

            return query;

        default:

            return (ObjectQuery<BlogPost>)query.OrderByDescending(p => p.PostedUtc);

    }

}

// GetList

internal static List<BlogPost> GetList(this ObjectQuery<BlogPost> query)

{

    return query.ToList<BlogPost>();

}

Wonderful, we can mix an match methods like this and they work great as long as we want to return BlogPost entities. But what if we just want a list of the BlogNames?  How do we handle that? It seems like a waste to pull a list of full BlogPost entity objects when all we really need is the BlogName.  In this case it would be nice to shape my return data into a simple list of BlogNames.

I typically handle this in my Projection Methods.  I already have a Filter Method that’s applying the appropriate filter, I have a Sort method that’s applying the appropriate sort, now I need a Projection Method that projects my results onto the appropriate data shape.  So, I need to create a new projection method to do this called GetBlogNames().  My new data access code and the new GetBlogNames() method look like this.

List<String> blogNames = GetAllBlogPosts().SortBy(sortOption).GetBlogNames();

 

// GetBlogNames

internal static List<String> GetBlogNames(this ObjectQuery<BlogPost> query)

{

    var newQuery = from p in query

                   select p.BlogProfile.BlogName;  

    return newQuery.Distinct().ToList<String>();

}

Look closely and you’ll see that the GetBlogNames() method does something kind of interesting.  It does a second select on my query.  We did the first select back in our Filter Method GetAllBlogPosts().  That select returned full BlogPost entities.  Now in my projection method I decide that I only need the BlogNames property so I do a second select that returns only the BlogNames. 

At first glance it may seem that this is still doing something really inefficient.  We have an initial query that pulls full entity objects, then we do a second select against that result set and it get’s the BlogNames.  We’re still pulling full entities in the first query, aren’t we?

No we’re not. Thanks to delayed execution our query never does pull full entities.  This goes back to the big realization that the query is now an object.  I’m a little fuzzy on some of the details here, but it goes something like this.  When we create an Entity Framework query, we ‘re creating an ObjectQuery that contains a Command Tree.  That’s important.  The ObjectQuery contains a Command Tree that represents each of the commands (like select, order by, from, etc.) in our query.  When we need the results of the ObjectQuery, like when we call ToList() or when we iterate over the results, our Command Tree gets translated into TSQL (or whatever is used by the provider we’re querying) and the query is executed. 

So, right up to the point that I call ToList(), my query is just a data structure containing all of my commands, and adding another select is just adding another command to the Command Tree.  At query time, all of my commands will be translated into SQL and executed as a single query that returns just BlogNames.

A Better Data Shaping Example

We’ve seen how we can use a select in our projection method to shape data.  Now let’s apply that principle to a more practical, real world, example, drop down list choices.  How many times have you seen code that pulled down a list of full entity objects, or a fully populated DataTable, just to populate the choices of a drop down list.  This scenario comes up all the time so I’ve added a ListItemDTO Data Transfer Object to my BLL (for more on DTOs see What is the difference between a DTO and a POCO). My ListItemDTO class is just a simple data container for a list item. 

public class ListItemDTO

{

    public String Text { get; set; }

    public String Value { get; set; }

}

Now, for this example let’s assume that in my UI, I have a page that displays a grid of all blog posts, for all blogs, that were posted in a given week.  Now I want to add a DropDownList to the top of this page that allows me to pick a blog and then the grid will display only the posts from that blog.  Simple enough scenario.  I just need to create a DropDownList that contains ListItems with Text equal to BlogName,  and values equal to BlogId.  To get the data to populate this ddl, I need a method in my Repository that gets the list of BlogPosts for the given week, but then shapes the return value as a distinct list of ListItemDTOs where the Text property is BlogName and the Value property is BlogId. 

For us this is no problem.  We already have the Filter and Sort methods needed to get the data, we just need a Projection method that will shape it as a List<ListItemDTO>. 

// GetBlogNameListItemDTOs

internal static List<ListItemDTO> GetBlogNameListItemDTOs(this ObjectQuery<BlogPost> query)

{

    List<ListItemDTO> items = new List<ListItemDTO>();

    var newQuery = from p in query

                   select new {p.BlogProfile.BlogName, p.BlogProfile.BlogId};

    foreach (var item in newQuery.Distinct())

    {

        items.Add( new ListItemDTO{Text=item.BlogName, Value=item.BlogId.ToString()});

    }

    return items;

}

In this Projection method, we added a new select that modified our query to return only the BlogName and the BlogId.  We then used a foreach loop to iterate over the results of the query, create a new ListItemDTO for each item, and then return the List<ListItemDTO>.  There are two details to make note of.  First we modified our query to return it’s data as an an anonymous type with two properties, BlogName and BlogId.  That’s the reason for the “new {p.BlogProfile.BlogName, p.BlogProfile.BlogId}” object syntax.  Second, when we iterate over the query in the foreach loop, we call the Distinct() method which, wait for it, makes sure our results are distinct. 

So, that was pretty painless. We can now write a data access method that leverages our existing Filter and Sort methods, and uses our new GetBlogNameListItemDTOs() to project the results onto a List<ListItemDTO>.  The code is given below.

// BlogNameListItemDtosForWeek

public List<ListItemDTO> BlogNameListItemDTOsForWeek(DateTime startDate)

{

  return GetBlogPostSetForWeek(startDate)

         .SortBy(Enums.BlogPostSortOption.BySite)

         .GetBlogNameListItemDTOs();

}

Conclusion

Hopefully this illustrates a little more of the potential of this pattern.  Now that I’ve been using it for a while I can’t bring myself to structure my Entity Framework Repositories any other way.  There are always situations where the model doesn’t quite work for me, and when that happens I just go back to writing Entity Framework queries from scratch.  But, so far I’ve been pleasantly surprised by how often my data access methods fit neatly into this little time saving model.  If you have suggestions on how to improve the pattern, or other data access patterns that have saved you time, please leave a comment or drop me an email.

For anyone who wants to see the Filter/Sort/Projection model in the context of a repository class, I’m including a partial listing of my BlogPostRepository below.  You’ll see everything we’ve covered over the last two posts, plus there’s some additional code that handles things like caching.

 

public class BlogPostRepository : RepositoryBase

{

 

    // Constructors

    public BlogPostRepository() { }

    public BlogPostRepository(RAEntities context)

    {

        this.Context = context;

    }

 

 

    //*********************************************************************************

    // Data Access Methods

    //******************************************************************************** 

 

    // BlogPostSetForWeek_GetPage

    public List<BlogPost> BlogPostSetForWeek_GetPage(int pageIndex, int pageSize, DateTime startDate, Enums.BlogPostSortOption sortOption, CacheOptions cacheOptions)

    {

        startDate = ToSunday(startDate);

        CacheHelper cache = new CacheHelper();

        string cacheKey = String.Format("BlogPostSetByWeek_GetPage_{0}_{1}_{2}_{3}", pageIndex.ToString(), pageSize.ToString(), startDate.ToShortDateString(), sortOption.ToString());

        // If the cache contains the data, and the user wants us to use

        // cache, then return the cached data

        if (cache.Contains(cacheKey) && cacheOptions.UseCache)

        {

            return (List<BlogPost>)cache.Get(cacheKey);

        }

        // If we fell through to this point then we need to pull data.

        List<BlogPost> postSet = GetBlogPostSetForWeek(startDate)

                                 .SortBy(sortOption)

                                 .GetPage(pageIndex, pageSize);

        if (cacheOptions.UseCache)

        {

            cache.Insert(postSet, cacheKey);

        }

        return postSet;

    }

 

 

    // BlogNamesForWeek

    public List<String> BlogNamesForWeek(DateTime startDate)

    {

        return GetBlogPostSetForWeek(startDate)

               .SortBy(Enums.BlogPostSortOption.BySite)

               .GetBlogNames();

    }

 

 

    // BlogNameListItemDtosForWeek

    public List<ListItemDTO> BlogNameListItemDTOsForWeek(DateTime startDate)

    {

        return GetBlogPostSetForWeek(startDate)

               .SortBy(Enums.BlogPostSortOption.BySite)

               .GetBlogNameListItemDTOs();

    }

 

 

    //*********************************************************************************

    // Filter Methods

    //*********************************************************************************   

    // GetBlogPostSetByDate

    private ObjectQuery<BlogPost> GetAllBlogPosts()

    {

        var query = from p in Context.BlogPostSet.Include("Categories").Include("BlogProfile")

                    select p;

        return (ObjectQuery<BlogPost>)query;

    }

 

    // GetBlogPostSetForWeek

    private ObjectQuery<BlogPost> GetBlogPostSetForWeek(DateTime startDate)

    {

        startDate = ToSunday(startDate);

        DateTime startUtc = startDate.Date;

        DateTime endUtc = startDate.AddDays(7).Date;

        var query = from p in Context.BlogPostSet.Include("Categories").Include("BlogProfile")

                    where p.PostedUtc > startUtc & p.PostedUtc < endUtc

                    select p;

        return (ObjectQuery<BlogPost>)query;

    }

 

 

 

 

    //*********************************************************************************

    // Utility Methods

    //*********************************************************************************

 

    public static DateTime ToSunday(DateTime date)

    {

        double offset = date.DayOfWeek == DayOfWeek.Sunday ? 0 : Convert.ToDouble(date.DayOfWeek);

        return date.AddDays(-offset);

    }

}

 

 

internal static class BlogPostRepositoryExtensionMethods

{

    //*********************************************************************************

    // Sort Methods

    //*********************************************************************************

 

    // SortBy

    internal static ObjectQuery<BlogPost> SortBy( this ObjectQuery<BlogPost> query, Enums.BlogPostSortOption sortOption)

    {

        switch (sortOption)

        {

            case Enums.BlogPostSortOption.ByDate:

                return (ObjectQuery<BlogPost>)query.OrderByDescending(p => p.PostedUtc);

            case Enums.BlogPostSortOption.BySite:

                return (ObjectQuery<BlogPost>)query.OrderBy(p => p.BlogProfile.BlogName);

            case Enums.BlogPostSortOption.ByVote:

                return query;

            default:

                return (ObjectQuery<BlogPost>)query.OrderByDescending(p => p.PostedUtc);

        }

    }

 

 

    //*********************************************************************************

    // Projection Methods

    //*********************************************************************************

 

    // GetList

    internal static List<BlogPost> GetList(this ObjectQuery<BlogPost> query)

    {

        return query.ToList<BlogPost>();

    }

 

    // GetPage

    internal static List<BlogPost> GetPage(this ObjectQuery<BlogPost> query, int pageIndex, int pageSize)

    {

        int skipCount = pageSize * pageIndex;

        return query.Skip(skipCount).Take(pageSize).ToList<BlogPost>();

    }

 

    // GetBlogNames

    internal static List<String> GetBlogNames(this ObjectQuery<BlogPost> query)

    {

        var newQuery = from p in query

                       select p.BlogProfile.BlogName;  

        return newQuery.Distinct().ToList<String>();

    }

 

    // GetBlogNameListItemDTOs

    internal static List<ListItemDTO> GetBlogNameListItemDTOs(this ObjectQuery<BlogPost> query)

    {

        List<ListItemDTO> items = new List<ListItemDTO>();

        var newQuery = from p in query

                       select new {p.BlogProfile.BlogName, p.BlogProfile.BlogId};

        foreach (var item in newQuery.Distinct())

        {

         items.Add( new ListItemDTO{Text=item.BlogName, Value=item.BlogId.ToString()});

        }

        return items;

    }

 

}

3 comments:

  1. For the fluent interface, why are you using extension methods (static taking this as a parameter) instead of non-static methods returning "this" and storing the query in the class? That way you could change the order of the parameters without having to call GetAllPosts first.

    ReplyDelete
  2. The reason for extension methods is that it enables the pseudo fluent interface chaining syntax Method1.Method2.method3. It's just a style choice.

    ReplyDelete
  3. Good job man!
    I really like your implementation about Fluent Interface.

    ReplyDelete