Monday, December 28, 2009

How do you test CRUD methods? Just hit the database and be happy.

It’s time for a little programmer heresy.  For the last few months I’ve been reevaluating all of my “best practices” and throwing out anything that creates friction.  Friction is a term I’ve heard Jeff Atwood use fairly often.  It’s a fuzzy term, but I think I understand what he’s getting at.  When you’re writing code and you’ve got some momentum going, friction is pretty much anything that slows you down: struggling against the persistence framework, jumping through hoops to find the bit of code that’s throwing an error (and is probably called through reflection), writing code that makes you scream inside but that some standard requires.  Sometimes friction is justified. Most of the time it isn’t.

Background

So I’ve been doing more and more unit testing lately.  One thing that I run up against often is how to test persistence (Create, Read, Update, Delete) methods.  Below is a diagram of some persistence classes from an application I’m working on.  The classes below handle persistence for my SubmittedBlogUrl entity.  BlogService is the one-stop shop that my application layer uses to do everything related to blogs. In Domain Driven Design parlance, the BlogService handles getting and saving data for all classes that are part of the Blog aggregate.  Getting and saving BlogProfiles, BlogPosts, and SubmittedBlogUrls all takes place through the BlogService class.  That doesn’t mean that BlogService contains all of the persistence logic, though.  I use a strategy pattern that delegates the actual Delete, Insert, Save, and Update methods to separate persister classes.

[Diagram: BlogService delegating persistence to separate persister classes]
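To make that delegation concrete, here’s a minimal sketch of the strategy idea: the service is the single entry point, and it hands the actual persistence work to a per-entity persister.  The class bodies below are illustrative stand-ins (an in-memory list instead of SQL Server), not the actual BlogService code.

```csharp
using System;
using System.Collections.Generic;

// Illustrative stand-in for the real entity.
class SubmittedBlogUrl { public int Id; public string BlogUrl; }

// The persister owns the mechanics of saving (here, just a list).
class SubmittedBlogUrlPersisterSketch
{
    private readonly List<SubmittedBlogUrl> _rows = new List<SubmittedBlogUrl>();
    public void Save(SubmittedBlogUrl dto) { _rows.Add(dto); }
    public int Count { get { return _rows.Count; } }
}

// The service is the one-stop shop; it delegates rather than
// containing the persistence logic itself.
class BlogServiceSketch
{
    private readonly SubmittedBlogUrlPersisterSketch _persister =
        new SubmittedBlogUrlPersisterSketch();

    public void SaveSubmittedBlogUrl(SubmittedBlogUrl dto) { _persister.Save(dto); }
    public int SubmittedCount { get { return _persister.Count; } }
}

class Program
{
    static void Main()
    {
        var service = new BlogServiceSketch();
        service.SaveSubmittedBlogUrl(new SubmittedBlogUrl { BlogUrl = "http://example.com" });
        Console.WriteLine(service.SubmittedCount);
    }
}
```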

Now the code I want to test is the Delete, Insert, and Update methods in my SubmittedBlogUrlPersister class.  The methods look like this:

public class SubmittedBlogUrlPersister
{
    public void Save(SubmittedBlogUrl dto)
    {
        if (dto.Id == NullValues.NullInt)
        {
            Insert(dto);
        }
        else
        {
            Update(dto);
        }
    }

    public void Insert(SubmittedBlogUrl dto)
    {
        SqlRepository repository = new SqlRepository();
        string sql = @"INSERT INTO [dbo].[SubmittedBlogUrl]
                            ([SubmittedByUserGuid]
                            ,[SubmittedOnUtc]
                            ,[IpAddress]
                            ,[Status]
                            ,[BlogUrl])
                        VALUES
                            (@SubmittedByUserGuid
                            ,@SubmittedOnUtc
                            ,@IpAddress
                            ,@Status
                            ,@BlogUrl)
                        SELECT SCOPE_IDENTITY()";

        SqlCommand command = repository.GetSqlCommand(sql);
        command.Parameters.Add(repository.CreateParameter("@SubmittedByUserGuid",
            dto.SubmittedByUserGuid));
        command.Parameters.Add(repository.CreateParameter("@SubmittedOnUtc",
            dto.SubmittedOnUtc));
        command.Parameters.Add(repository.CreateParameter("@IpAddress", dto.IpAddress, 20));
        command.Parameters.Add(repository.CreateParameter("@Status", dto.Status));
        command.Parameters.Add(repository.CreateParameter("@BlogUrl", dto.BlogUrl, 100));
        Object result = repository.ExecuteScalar(command);
        dto.Id = Convert.ToInt32(result);
    }

    public void Update(SubmittedBlogUrl dto)
    {
        SqlRepository repository = new SqlRepository();
        string sql = @"UPDATE [dbo].[SubmittedBlogUrl]
                       SET [SubmittedByUserGuid] = @SubmittedByUserGuid
                          ,[SubmittedOnUtc] = @SubmittedOnUtc
                          ,[IpAddress] = @IpAddress
                          ,[Status] = @Status
                          ,[BlogUrl] = @BlogUrl
                     WHERE Id = @Id";

        SqlCommand command = repository.GetSqlCommand(sql);
        command.Parameters.Add(repository.CreateParameter("@Id", dto.Id));
        command.Parameters.Add(repository.CreateParameter("@SubmittedByUserGuid",
            dto.SubmittedByUserGuid));
        command.Parameters.Add(repository.CreateParameter("@SubmittedOnUtc",
            dto.SubmittedOnUtc));
        command.Parameters.Add(repository.CreateParameter("@IpAddress", dto.IpAddress, 20));
        command.Parameters.Add(repository.CreateParameter("@Status", dto.Status));
        command.Parameters.Add(repository.CreateParameter("@BlogUrl", dto.BlogUrl, 100));
        repository.ExecuteNonQuery(command);
    }

    public void Delete(SubmittedBlogUrl dto)
    {
        SqlRepository repository = new SqlRepository();
        string sql = @"DELETE FROM [dbo].[SubmittedBlogUrl]
                       WHERE Id = @Id";

        SqlCommand command = repository.GetSqlCommand(sql);
        command.Parameters.Add(repository.CreateParameter("@Id", dto.Id));
        repository.ExecuteNonQuery(command);
    }
}

As you can see, these methods just create some parameterized T-SQL, package it in a command, and then pass the command off to my repository to be executed. The key thing that I need to test here is the query logic.  I need to make sure I didn’t make any mistakes when writing the SQL and creating parameters, and I need to make sure that the SQL works when run against my database.

Getting to the Tests

So how should I test this?  The TDD purists might say that I should fire up a mocking framework, create some mock objects of my repository, and make sure that all the right things are passed in.  I also might need to look at a Dependency Injection framework to make it easier to switch out my real SqlRepository with the mock SqlRepository.  And at the end of all this coding I’ll know what???  Well, I’ll just know that I passed a command to a mock object without anything blowing up.  I won’t know if my SQL syntax is right, I won’t know if my SQL works with my database, I won’t know anything that I actually need to know. I call this friction: lots of effort that at the end of the day doesn’t even get me what I need, a valid test of the SQL in my persister class.

My solution: let’s dumb down the tests and get on with coding.  I’ll create a single CRUD test that instantiates a real SubmittedBlogUrlPersister (not a mock), then creates a new SubmittedBlogUrl object, saves it to the database, updates it, and finally deletes it.  As long as I use a dev database (not production) and make sure that my test cleans up after itself (deletes the data it creates), this should work just fine.  The resulting test looks like this:

[TestClass()]
public class SubmittedBlogUrlPersisterTest
{
    [TestMethod()]
    public void CrudTest()
    {
        // create our test object
        var persister = new SubmittedBlogUrlPersister();
        var service = new BlogService();
        var dto = new SubmittedBlogUrl();
        SubmittedBlogUrl dto2;
        var utcNow = DateTime.UtcNow;
        dto.BlogUrl = "testUrl";
        dto.IpAddress = "testIp";
        dto.SubmittedByUserGuid = Guid.NewGuid();
        dto.SubmittedOnUtc = utcNow;
        // insert it
        persister.Insert(dto);
        // get it
        dto2 = service.GetSubmittedBlogUrl(dto.Id);
        Assert.AreEqual(dto.Id, dto2.Id);
        // update it
        dto2.BlogUrl = "NewUrl";
        persister.Save(dto2);
        dto2 = service.GetSubmittedBlogUrl(dto.Id);
        Assert.AreEqual(dto2.BlogUrl, "NewUrl");
        // delete it
        persister.Delete(dto2);
        dto2 = service.GetSubmittedBlogUrl(dto.Id);
        Assert.IsNull(dto2);
    }
}

Technically, this isn’t a unit test, it’s an integration test, since it’s hitting a real database, but I think it’s the right approach for this situation.  The principle here is that when you reach the point where you’re writing tests for methods that contain query logic (like CRUD methods), that might be the right place to ditch the unit tests and switch over to some simple integration tests that write to a real database and test what you really need tested: your query logic.
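One fragile spot in a test like CrudTest is cleanup: if an assertion throws partway through, the inserted row never gets deleted.  A simple guard is to wrap the assertions in try/finally so the delete always runs.  Here’s a minimal, self-contained sketch of that pattern; the in-memory FakeStore below is a hypothetical stand-in for the real SQL-backed persister so the example can run anywhere.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical in-memory stand-in for the SQL-backed persister.
class FakeStore
{
    private readonly Dictionary<int, string> _rows = new Dictionary<int, string>();
    private int _nextId = 1;

    public int Insert(string value) { _rows[_nextId] = value; return _nextId++; }
    public void Delete(int id) { _rows.Remove(id); }
    public int Count { get { return _rows.Count; } }
}

class Program
{
    static void Main()
    {
        var store = new FakeStore();
        int id = store.Insert("testUrl");
        try
        {
            // ... assertions that may throw go here ...
        }
        finally
        {
            // Cleanup runs even if an assertion failed above.
            store.Delete(id);
        }
        Console.WriteLine(store.Count);
    }
}
```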

Addendum

A great comment that I wanted to draw attention to, from Steve Freeman, who co-authored the book Growing Object-Oriented Software, Guided by Tests.  Steve said:

As a TDD purist who "wrote the book" on Mock Objects and once wrote a foolish paper on mocking out JDBC, I would say that testing against a real database is the right thing to do here (provided you've set up your environment to do this cleanly). The other right thing is to only need such tests at the boundaries of your application, the persistence code should be nicely contained and not leak into the domain code.

That makes a lot of sense to me.  I might be off a bit on what I expect to hear from TDD purists.  I like Steve’s statement that you need integration tests like this only at the boundaries of your application.  In my example, if I’m testing app logic in an MVC controller class that uses my BlogService for persistence, then it makes sense to mock my BlogService and write real unit tests because I really just want to test the logic in my controller.  But at the edge of the application, where I’m touching the database, it’s probably more appropriate to switch over to integration tests.  I think a good rule of thumb is that if the code you’re testing contains any query logic (SQL, LINQ, etc.), go ahead and hit the db.
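To make the controller-boundary case concrete, here’s a rough sketch of what mocking the BlogService could look like.  I’m using a hand-rolled fake instead of a mocking framework, and the IBlogService interface, FakeBlogService, and BlogController names are hypothetical stand-ins, not code from my app.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical interface extracted from BlogService so the controller
// depends on an abstraction rather than the concrete service.
interface IBlogService
{
    List<string> GetBlogNames();
}

// Hand-rolled fake standing in for a mocking framework.
class FakeBlogService : IBlogService
{
    public List<string> GetBlogNames() { return new List<string> { "A", "B" }; }
}

// A controller-ish class whose logic we want to unit test with no database.
class BlogController
{
    private readonly IBlogService _service;
    public BlogController(IBlogService service) { _service = service; }

    public string Headline() { return _service.GetBlogNames().Count + " blogs tracked"; }
}

class Program
{
    static void Main()
    {
        // The unit test exercises only the controller's logic.
        var controller = new BlogController(new FakeBlogService());
        Console.WriteLine(controller.Headline());
    }
}
```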

Thursday, September 10, 2009

Angry-Coders: Stupid Programmer Tricks

I’ve done it, you’ve probably done it, at some point we’ve all done it.  So let’s just agree on one simple principle.  Whether you’re leaving for a week, a day, or a couple of hours, don’t check in any code right before you go.  You will break the build.  It’s like some immutable law of nature.


Read more Angry-Coders >>


Wednesday, September 2, 2009

Angry-Coders Comic: The Evolution of Tech-Guy Business Attire

I’m a big advocate of casual work attire for programmers.  However, I’ve also decided that looking at funky, Bilbo-Baggins toe hair is not having a positive impact on my work life.   -rudy


Read more Angry-Coders >>

Monday, August 31, 2009

Angry-Coders Strip #1

In addition to falling further and further behind on the .Net blog, I’ve been playing with the idea of doing a comic strip.  I eventually want to try for a concept that’s like Dilbert but more mean-spirited.  I have around 40 strips storyboarded but it takes a flippin long time to draw these things. I’m going to try and get a new strip out once a week for the next month or two.  I also have a blog series in mind where I go step by step through the process of building a blog aggregator using ASP.Net MVC and Entity Framework.  -rudy


Read more Angry-Coders >>

Thursday, August 27, 2009

Must Read Books for Senior C# Developers

I keep hearing talk about how “dead-tree books” are going away and have been replaced by the wealth of immediately available information on the internet.  Just to make my position clear, that’s a bunch of crap.  I do agree that the internet is the best tool for finding quick reference information; however, while that river may be wide, it doesn’t run deep. When you’re in the early stages of your career, and you’re trying to figure out the right code syntax or how to use some API, the internet is your best option.  But, when you’re further along, you know how to program, and now you’re looking for in-depth information on how to architect applications, bring a legacy code base under test, or even just a good solid coverage of design patterns, it’s time to dig into those “dead-tree books”.

I read about 40 technical books a year, most of them through inter-library loan with the local universities.  Below are a few books that I consider must reads for any .Net programmer who’s serious about crafting software.


Agile Principles, Patterns, and Practices in C#  by Robert C. Martin and Micah Martin
This is probably the single most important programming book that I’ve ever read.  It is a must must must read for any programmer who cares about crafting code and is ready to move their professional skills to the next level. Through stories, pictures, and some well written instructional text, Robert Martin covers architecture, design patterns, TDD, and most importantly, he demonstrates the process that highly skilled software developers use to design code. 

 

Domain Driven Design by Eric Evans
This is the book that really started it all.  If you’ve ever heard the terms POCO (POJO), DTO, Repository, Aggregate, or Ubiquitous Language and wondered what the heck those people were talking about, this is the book with the answers.  Eric Evans really lays out a road map for where I think we’re going with software development.  The examples are in Java, but if that’s a problem then you’re probably not ready for this book anyway.

 


Applying Domain Driven Design and Patterns by Jimmy Nilsson
Another fantastic DDD book.  Jimmy Nilsson builds on top of the foundation laid by Eric Evans and provides a more hands-on, example driven treatment of how to design and build an application using DDD.  Even if you never plan to use DDD, this book is full of architecture and software development examples that will make you a better developer. Plus, all of the code examples in this one are written in C#.

 


Working Effectively with Legacy Code by Michael Feathers
There are a lot of books out there that tell you how to write unit tests, and even how to write code that is testable. We need to know that stuff, but it doesn’t have much practical application for most of us unless we’re on a green field project.  Michael Feathers’ book is the only one I’ve seen that is written for the rest of us.  His book assumes that you’re starting with a sprawling, tangled mass of legacy code that you really don’t quite understand, and which you inherited from 10 other programmers who no longer work for the company. He then takes you through a catalog of hands-on techniques that you can use to safely refactor pieces of that codebase, nail down the functionality of those pieces with high level tests, and then bit by bit, safely bring that monster codebase under unit testing.

Four books is probably enough for now.  There are a lot more (like Jon Skeet’s C# book) but these are the ones that I think are most important and relevant to the direction that .Net development is going.  So if you haven’t had a chance to read any of these, dust off your library card or your Amazon account and give them a look.  See if you agree that these books provide something that you just can’t find in a Google search.

Sunday, August 2, 2009

More on Fluent Interface Pattern for Composing Entity Framework Queries

Recap of Last Week

In last week's post I went over an Entity Framework pattern that I’ve been using that’s saved me quite a bit of time in my Repository classes.  The basic idea is that most Entity Framework queries can be broken down into 3 pieces as shown in the diagram below.

[Diagram: an EF query broken into Filter, Sort, and Projection pieces]

The interesting part emerges when we realize that an EF query is just an object of type ObjectQuery, and that we can create methods that take an ObjectQuery as a parameter, do their processing, and then pass back a return value of type ObjectQuery. This means that we can chain the methods together, fluent interface style, to create queries that look like this:

List<BlogPost> postSet = GetBlogPostSet().SortBy("Date").GetPage(pageIndex, pageSize);

List<BlogPost> postSet = GetBlogPostSet().SortBy("ID").GetPage(pageIndex, pageSize);

List<BlogPost> postSet = GetBlogPostSet().SortBy("ID").GetAll();
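If you want to play with the chaining idea without wiring up Entity Framework, the same fluent style can be sketched with plain LINQ to Objects, using IEnumerable&lt;T&gt; where the real code uses ObjectQuery&lt;T&gt;.  The Post class and extension methods below are illustrative stand-ins, not the actual repository code.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for the BlogPost entity.
class Post
{
    public int Id;
    public DateTime Date;
}

// Each method takes a sequence and returns a sequence (or a final list),
// which is what lets the calls chain fluent-interface style.
static class PostQueryExtensions
{
    public static IEnumerable<Post> SortBy(this IEnumerable<Post> query, string key)
    {
        return key == "Date"
            ? query.OrderByDescending(p => p.Date)
            : (IEnumerable<Post>)query.OrderBy(p => p.Id);
    }

    public static List<Post> GetPage(this IEnumerable<Post> query, int pageIndex, int pageSize)
    {
        // Projection method: applies paging and executes the query.
        return query.Skip(pageIndex * pageSize).Take(pageSize).ToList();
    }
}

class Program
{
    static void Main()
    {
        var posts = Enumerable.Range(1, 10)
            .Select(i => new Post { Id = i, Date = new DateTime(2009, 1, i) });
        // Chain: filter source -> sort -> page, as in the article's pattern.
        List<Post> page = posts.SortBy("Date").GetPage(0, 3);
        foreach (var p in page) Console.WriteLine(p.Id);
    }
}
```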

Shaping Data With Projection Methods

This week I want to take a closer look at the Projection Method and how we can use it to shape our return data.  So let’s start out by looking at a pretty standard query that returns a list of BlogPost entities.

List<BlogPost> posts = GetAllBlogPosts().SortBy(sortOption).GetList();

This query contains each of the three components described above: a Filter method (GetAllBlogPosts), a Sort method (SortBy), and a Projection method (GetList). The code for each of the methods is listed below.  Note that our Projection Method, GetList(), just runs ToList() on the query and returns the resulting List&lt;BlogPost&gt;.

// GetAllBlogPosts
private ObjectQuery<BlogPost> GetAllBlogPosts()
{
    var query = from p in Context.BlogPostSet.Include("Categories")
                select p;
    return (ObjectQuery<BlogPost>)query;
}

// SortBy
internal static ObjectQuery<BlogPost> SortBy(this ObjectQuery<BlogPost> query, Enums.BlogPostSortOption sortOption)
{
    switch (sortOption)
    {
        case Enums.BlogPostSortOption.ByDate:
            return (ObjectQuery<BlogPost>)query.OrderByDescending(p => p.PostedUtc);
        case Enums.BlogPostSortOption.BySite:
            return (ObjectQuery<BlogPost>)query.OrderBy(p => p.BlogProfile.BlogName);
        case Enums.BlogPostSortOption.ByVote:
            return query;
        default:
            return (ObjectQuery<BlogPost>)query.OrderByDescending(p => p.PostedUtc);
    }
}

// GetList
internal static List<BlogPost> GetList(this ObjectQuery<BlogPost> query)
{
    return query.ToList<BlogPost>();
}

Wonderful, we can mix and match methods like this and they work great as long as we want to return BlogPost entities. But what if we just want a list of the BlogNames?  How do we handle that? It seems like a waste to pull a list of full BlogPost entity objects when all we really need is the BlogName.  In this case it would be nice to shape my return data into a simple list of BlogNames.

I typically handle this in my Projection Methods.  I already have a Filter Method that applies the appropriate filter, and a Sort Method that applies the appropriate sort; now I need a Projection Method that projects my results onto the appropriate data shape.  So I need to create a new Projection Method called GetBlogNames().  My new data access code and the new GetBlogNames() method look like this.

List<String> blogNames = GetAllBlogPosts().SortBy(sortOption).GetBlogNames();

// GetBlogNames
internal static List<String> GetBlogNames(this ObjectQuery<BlogPost> query)
{
    var newQuery = from p in query
                   select p.BlogProfile.BlogName;
    return newQuery.Distinct().ToList<String>();
}

Look closely and you’ll see that the GetBlogNames() method does something kind of interesting.  It does a second select on my query.  We did the first select back in our Filter Method, GetAllBlogPosts().  That select returned full BlogPost entities.  Now in my projection method I decide that I only need the BlogName property, so I do a second select that returns only the BlogNames.

At first glance it may seem that this is still doing something really inefficient.  We have an initial query that pulls full entity objects, then we do a second select against that result set that gets the BlogNames.  We’re still pulling full entities in the first query, aren’t we?

No, we’re not. Thanks to delayed execution, our query never does pull full entities.  This goes back to the big realization that the query is now an object.  I’m a little fuzzy on some of the details here, but it goes something like this.  When we create an Entity Framework query, we’re creating an ObjectQuery that contains a Command Tree.  That’s important.  The ObjectQuery contains a Command Tree that represents each of the commands (like select, order by, from, etc.) in our query.  When we need the results of the ObjectQuery, like when we call ToList() or when we iterate over the results, our Command Tree gets translated into T-SQL (or whatever is used by the provider we’re querying) and the query is executed.

So, right up to the point that I call ToList(), my query is just a data structure containing all of my commands, and adding another select is just adding another command to the Command Tree.  At query time, all of my commands will be translated into SQL and executed as a single query that returns just BlogNames.
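You can see the same deferred execution behavior in plain LINQ to Objects.  The sketch below is my own illustration (not EF code): it counts how many times the first projection actually runs, and shows that nothing executes until enumeration, and then only for the elements actually requested.

```csharp
using System;
using System.Linq;

class DeferredExecutionDemo
{
    static int evaluations = 0;

    static void Main()
    {
        var numbers = Enumerable.Range(1, 100);

        // First "select": counts each time it evaluates an element.
        var query = numbers.Select(n => { evaluations++; return n * 2; });

        // Second select composes on top of the first; still nothing has run.
        var narrowed = query.Select(n => n.ToString());

        Console.WriteLine(evaluations); // composing queries triggered no work

        // Enumeration finally executes the pipeline, and only for 5 elements.
        var firstFive = narrowed.Take(5).ToList();
        Console.WriteLine(evaluations);
    }
}
```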

A Better Data Shaping Example

We’ve seen how we can use a select in our projection method to shape data.  Now let’s apply that principle to a more practical, real world example: drop down list choices.  How many times have you seen code that pulled down a list of full entity objects, or a fully populated DataTable, just to populate the choices of a drop down list?  This scenario comes up all the time so I’ve added a ListItemDTO Data Transfer Object to my BLL (for more on DTOs see What is the difference between a DTO and a POCO). My ListItemDTO class is just a simple data container for a list item.

public class ListItemDTO
{
    public String Text { get; set; }
    public String Value { get; set; }
}

Now, for this example let’s assume that in my UI I have a page that displays a grid of all blog posts, for all blogs, that were posted in a given week.  I want to add a DropDownList to the top of this page that allows me to pick a blog, so that the grid will display only the posts from that blog.  Simple enough scenario.  I just need to create a DropDownList that contains ListItems with Text equal to BlogName and Value equal to BlogId.  To get the data to populate this ddl, I need a method in my Repository that gets the list of BlogPosts for the given week, but then shapes the return value as a distinct list of ListItemDTOs where the Text property is BlogName and the Value property is BlogId.

For us this is no problem.  We already have the Filter and Sort methods needed to get the data, we just need a Projection method that will shape it as a List<ListItemDTO>. 

// GetBlogNameListItemDTOs
internal static List<ListItemDTO> GetBlogNameListItemDTOs(this ObjectQuery<BlogPost> query)
{
    List<ListItemDTO> items = new List<ListItemDTO>();
    var newQuery = from p in query
                   select new { p.BlogProfile.BlogName, p.BlogProfile.BlogId };
    foreach (var item in newQuery.Distinct())
    {
        items.Add(new ListItemDTO { Text = item.BlogName, Value = item.BlogId.ToString() });
    }
    return items;
}

In this Projection method, we added a new select that modified our query to return only the BlogName and the BlogId.  We then used a foreach loop to iterate over the results of the query, create a new ListItemDTO for each item, and then return the List&lt;ListItemDTO&gt;.  There are two details to make note of.  First, we modified our query to return its data as an anonymous type with two properties, BlogName and BlogId.  That’s the reason for the “new {p.BlogProfile.BlogName, p.BlogProfile.BlogId}” object syntax.  Second, when we iterate over the query in the foreach loop, we call the Distinct() method which, wait for it, makes sure our results are distinct.

So, that was pretty painless. We can now write a data access method that leverages our existing Filter and Sort methods, and uses our new GetBlogNameListItemDTOs() to project the results onto a List<ListItemDTO>.  The code is given below.

// BlogNameListItemDTOsForWeek
public List<ListItemDTO> BlogNameListItemDTOsForWeek(DateTime startDate)
{
    return GetBlogPostSetForWeek(startDate)
           .SortBy(Enums.BlogPostSortOption.BySite)
           .GetBlogNameListItemDTOs();
}
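As an aside, Distinct() also behaves sensibly on anonymous types in plain LINQ to Objects, because the compiler generates value-based Equals and GetHashCode over an anonymous type’s properties (against a real ObjectQuery, the Distinct is instead translated into a SQL DISTINCT).  A quick self-contained illustration of my own, not code from the repository:

```csharp
using System;
using System.Linq;

class Program
{
    static void Main()
    {
        // Two of these instances have identical property values, so
        // value-based equality lets Distinct() collapse them to one.
        var pairs = new[]
        {
            new { BlogName = "A", BlogId = 1 },
            new { BlogName = "A", BlogId = 1 },
            new { BlogName = "B", BlogId = 2 }
        };
        Console.WriteLine(pairs.Distinct().Count());
    }
}
```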

Conclusion

Hopefully this illustrates a little more of the potential of this pattern.  Now that I’ve been using it for a while I can’t bring myself to structure my Entity Framework Repositories any other way.  There are always situations where the model doesn’t quite work for me, and when that happens I just go back to writing Entity Framework queries from scratch.  But, so far I’ve been pleasantly surprised by how often my data access methods fit neatly into this little time saving model.  If you have suggestions on how to improve the pattern, or other data access patterns that have saved you time, please leave a comment or drop me an email.

For anyone who wants to see the Filter/Sort/Projection model in the context of a repository class, I’m including a partial listing of my BlogPostRepository below.  You’ll see everything we’ve covered over the last two posts, plus there’s some additional code that handles things like caching.

 

public class BlogPostRepository : RepositoryBase
{
    // Constructors
    public BlogPostRepository() { }
    public BlogPostRepository(RAEntities context)
    {
        this.Context = context;
    }

    //*********************************************************************************
    // Data Access Methods
    //*********************************************************************************

    // BlogPostSetForWeek_GetPage
    public List<BlogPost> BlogPostSetForWeek_GetPage(int pageIndex, int pageSize, DateTime startDate, Enums.BlogPostSortOption sortOption, CacheOptions cacheOptions)
    {
        startDate = ToSunday(startDate);
        CacheHelper cache = new CacheHelper();
        string cacheKey = String.Format("BlogPostSetByWeek_GetPage_{0}_{1}_{2}_{3}", pageIndex.ToString(), pageSize.ToString(), startDate.ToShortDateString(), sortOption.ToString());
        // If the cache contains the data, and the user wants us to use
        // cache, then return the cached data
        if (cache.Contains(cacheKey) && cacheOptions.UseCache)
        {
            return (List<BlogPost>)cache.Get(cacheKey);
        }
        // If we fell through to this point then we need to pull data.
        List<BlogPost> postSet = GetBlogPostSetForWeek(startDate)
                                 .SortBy(sortOption)
                                 .GetPage(pageIndex, pageSize);
        if (cacheOptions.UseCache)
        {
            cache.Insert(postSet, cacheKey);
        }
        return postSet;
    }

    // BlogNamesForWeek
    public List<String> BlogNamesForWeek(DateTime startDate)
    {
        return GetBlogPostSetForWeek(startDate)
               .SortBy(Enums.BlogPostSortOption.BySite)
               .GetBlogNames();
    }

    // BlogNameListItemDTOsForWeek
    public List<ListItemDTO> BlogNameListItemDTOsForWeek(DateTime startDate)
    {
        return GetBlogPostSetForWeek(startDate)
               .SortBy(Enums.BlogPostSortOption.BySite)
               .GetBlogNameListItemDTOs();
    }

    //*********************************************************************************
    // Filter Methods
    //*********************************************************************************

    // GetAllBlogPosts
    private ObjectQuery<BlogPost> GetAllBlogPosts()
    {
        var query = from p in Context.BlogPostSet.Include("Categories").Include("BlogProfile")
                    select p;
        return (ObjectQuery<BlogPost>)query;
    }

    // GetBlogPostSetForWeek
    private ObjectQuery<BlogPost> GetBlogPostSetForWeek(DateTime startDate)
    {
        startDate = ToSunday(startDate);
        DateTime startUtc = startDate.Date;
        DateTime endUtc = startDate.AddDays(7).Date;
        var query = from p in Context.BlogPostSet.Include("Categories").Include("BlogProfile")
                    where p.PostedUtc > startUtc && p.PostedUtc < endUtc
                    select p;
        return (ObjectQuery<BlogPost>)query;
    }

    //*********************************************************************************
    // Utility Methods
    //*********************************************************************************

    public static DateTime ToSunday(DateTime date)
    {
        double offset = date.DayOfWeek == DayOfWeek.Sunday ? 0 : Convert.ToDouble(date.DayOfWeek);
        return date.AddDays(-offset);
    }
}

internal static class BlogPostRepositoryExtensionMethods
{
    //*********************************************************************************
    // Sort Methods
    //*********************************************************************************

    // SortBy
    internal static ObjectQuery<BlogPost> SortBy(this ObjectQuery<BlogPost> query, Enums.BlogPostSortOption sortOption)
    {
        switch (sortOption)
        {
            case Enums.BlogPostSortOption.ByDate:
                return (ObjectQuery<BlogPost>)query.OrderByDescending(p => p.PostedUtc);
            case Enums.BlogPostSortOption.BySite:
                return (ObjectQuery<BlogPost>)query.OrderBy(p => p.BlogProfile.BlogName);
            case Enums.BlogPostSortOption.ByVote:
                return query;
            default:
                return (ObjectQuery<BlogPost>)query.OrderByDescending(p => p.PostedUtc);
        }
    }

    //*********************************************************************************
    // Projection Methods
    //*********************************************************************************

    // GetList
    internal static List<BlogPost> GetList(this ObjectQuery<BlogPost> query)
    {
        return query.ToList<BlogPost>();
    }

    // GetPage
    internal static List<BlogPost> GetPage(this ObjectQuery<BlogPost> query, int pageIndex, int pageSize)
    {
        int skipCount = pageSize * pageIndex;
        return query.Skip(skipCount).Take(pageSize).ToList<BlogPost>();
    }

    // GetBlogNames
    internal static List<String> GetBlogNames(this ObjectQuery<BlogPost> query)
    {
        var newQuery = from p in query
                       select p.BlogProfile.BlogName;
        return newQuery.Distinct().ToList<String>();
    }

    // GetBlogNameListItemDTOs
    internal static List<ListItemDTO> GetBlogNameListItemDTOs(this ObjectQuery<BlogPost> query)
    {
        List<ListItemDTO> items = new List<ListItemDTO>();
        var newQuery = from p in query
                       select new { p.BlogProfile.BlogName, p.BlogProfile.BlogId };
        foreach (var item in newQuery.Distinct())
        {
            items.Add(new ListItemDTO { Text = item.BlogName, Value = item.BlogId.ToString() });
        }
        return items;
    }
}

Tuesday, July 28, 2009

Fluent Interface Pattern for Composing Entity Framework Queries

I’ve been doing a fair amount of work with Entity Framework recently.  There are some things about EF that make me want to throw it out the window, but this post is about something that I really like: the ability to eliminate redundant code from my BLL and DAL and create a fluent interface for composing queries.

The problem we want to solve

So here’s a typical scenario.  I have a blog aggregator application that I’m building.  I use Entity Framework to create a BlogPost entity and its data mappings. Great, now I’m ready to create a BlogRepository class that will contain all of my queries for getting Blog posts.  So I write the first data access method, and it looks something like this:

public List<BlogPost> BlogPostSetByWeek_GetPage(int pageIndex, int pageSize, DateTime startDate)
{
    startDate = ToSunday(startDate);
    DateTime startUtc = startDate.Date;
    DateTime endUtc = startDate.AddDays(7).Date;
    int skipCount = pageSize * pageIndex;
    var query = from p in Context.BlogPostSet.Include("Categories")
                where p.PostedUtc > startUtc && p.PostedUtc < endUtc
                orderby p.PostedUtc descending
                select p;
    List<BlogPost> postSet = query.Skip(skipCount).Take(pageSize).ToList<BlogPost>();
    return postSet;
}

The above method takes a startDate and some paging parameters and then returns the specified page of results in the shape of a generic list of BlogPost entities.  How easy was that!! 

Now for the next step.  I need a query that’s exactly like the one above, but this time I want the entire list of results instead of just a page.  After that I need another query that sorts the BlogPosts by Category instead of by PostedUtc, then another that sorts by BlogName, and on and on.  So how do I handle this?  I could write a completely new EF query for each one.  Or I could use EntitySQL instead of LINQ to Entities and build the EntitySQL text with a bunch of conditional blocks… Neither of those solutions really appeals to me.  First, I don’t like the idea of rewriting the same query over and over with minor differences in criteria or sort order; that just seems inefficient.  Second, I don’t really want to use EntitySQL because I like the strong typing that I get with LINQ to Entities, plus I would need a lot of conditionals to handle all of the possible query combinations, and that sounds like a mess.

The Solution

So I was thinking about how much I hate duplicating the same query code over and over when I realized something.  Microsoft has made the query an object, and I didn’t really appreciate the significance of that before.  The query is no longer just text; it is an object, an ObjectQuery<T> to be precise.  The cool part is that if I write methods that take an ObjectQuery as a parameter and return an ObjectQuery, I can chain them together and use them to compose queries.

How could this work?  I looked at the queries in my BLL and found that each of them consists of 3 major components:

[Image: the three major components of a query: filter criteria, sort order, and projection/paging]

Looking at this breakdown, I realized that I could have a Filter Method that creates an ObjectQuery that gets the data I’m looking for, then pass that ObjectQuery to a Sort Method that applies a sort and returns the modified ObjectQuery, then pass that to a Projection Method that applies paging, shapes the data, and executes the ObjectQuery.

So, when all is said and done, I should be able to compose Entity Framework queries by combining a Filter Method, a Sort Method, and a Projection Method.  The end result should be data access code that looks like this:

List<BlogPost> postSet = GetBlogPostSet().SortBy("Date").GetPage(pageIndex, pageSize);
List<BlogPost> postSet = GetBlogPostSet().SortBy("ID").GetPage(pageIndex, pageSize);
List<BlogPost> postSet = GetBlogPostSet().SortBy("ID").GetAll();
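The chaining mechanics don’t depend on EF at all.  Here’s a small, self-contained sketch of the same idea using plain LINQ-to-Objects extension methods (the names Evens and GetPage are mine, just for illustration):

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

internal static class ChainDemo
{
    // "Filter" step: takes a sequence and returns a sequence, so calls compose.
    internal static IEnumerable<int> Evens(this IEnumerable<int> source)
    {
        return source.Where(n => n % 2 == 0);
    }

    // "Projection" step: applies paging and executes the query.
    internal static List<int> GetPage(this IEnumerable<int> source, int pageIndex, int pageSize)
    {
        return source.Skip(pageIndex * pageSize).Take(pageSize).ToList();
    }

    private static void Main()
    {
        // Evens of 1..20 are 2,4,...,20; page 1 of size 3 skips the first three.
        List<int> page = Enumerable.Range(1, 20).Evens().GetPage(1, 3);
        Console.WriteLine(string.Join(", ", page)); // 8, 10, 12
    }
}
```

Swap IEnumerable<int> for ObjectQuery<BlogPost> and the shape is exactly the pattern in this post.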

Building an Example

So, I coded it up and it works pretty well.  The first step is creating a Filter Method.  This method takes search criteria as parameters and returns an ObjectQuery. Below is my filter method for getting the BlogPost entities for a given week. 

// GetBlogPostSetForWeek
private ObjectQuery<BlogPost> GetBlogPostSetForWeek(DateTime startDate)
{
    startDate = ToSunday(startDate);
    DateTime startUtc = startDate.Date;
    DateTime endUtc = startDate.AddDays(7).Date;
    var query = from p in Context.BlogPostSet.Include("Categories")
                where p.PostedUtc > startUtc && p.PostedUtc < endUtc
                select p;
    return (ObjectQuery<BlogPost>)query;
}

Now I need to create my Sort Method. This method takes the results of my Filter Method as a parameter, along with an enum that tells the method which sort to apply. Note that I’m using strongly typed object queries of type ObjectQuery<BlogPost>.  The strong typing serves two purposes.  First, it lets my Sort Method know that I’m dealing with BlogPost entities, which tells me what fields are available to sort by.  Second, the strong typing provides a distinct method signature, so I can have multiple methods called SortBy that each handle ObjectQueries of different entity types: SortBy(ObjectQuery<BlogPost>), SortBy(ObjectQuery<Person>), etc.

One other thing: I want to chain these methods together, fluent interface style.  For that reason I’m implementing both SortBy and GetPage as extension methods. Here’s the code for the SortBy method.

// SortBy
internal static ObjectQuery<BlogPost> SortBy(this ObjectQuery<BlogPost> query, Enums.BlogPostSortOption sortOption)
{
    switch (sortOption)
    {
        case Enums.BlogPostSortOption.ByDate:
            return (ObjectQuery<BlogPost>)query.OrderByDescending(p => p.PostedUtc);
        case Enums.BlogPostSortOption.BySite:
            return (ObjectQuery<BlogPost>)query.OrderBy(p => p.BlogProfile.BlogName);
        case Enums.BlogPostSortOption.ByVote:
            return query;
        default:
            return (ObjectQuery<BlogPost>)query.OrderByDescending(p => p.PostedUtc);
    }
}

Lastly, we need to create a Projection Method.  Below is the GetPage method.  It takes the ObjectQuery<BlogPost> from the SortBy method, applies paging logic, executes the query, and returns the results as a List<BlogPost>.

// GetPage
internal static List<BlogPost> GetPage(this ObjectQuery<BlogPost> query, int pageIndex, int pageSize)
{
    int skipCount = pageSize * pageIndex;
    return query.Skip(skipCount).Take(pageSize).ToList<BlogPost>();
}
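The GetAll projection method that appears in the target syntax earlier never gets shown in the post.  A minimal sketch, assuming it simply mirrors GetPage without the paging, might be:

```csharp
// GetAll -- hypothetical companion to GetPage: executes the query
// and returns every row, with no paging applied.
internal static List<BlogPost> GetAll(this ObjectQuery<BlogPost> query)
{
    return query.ToList<BlogPost>();
}
```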

So that’s it.  I now have all the pieces needed to create my data access methods without duplicating query logic over and over.  If I want a page of the week’s blog posts ordered by date, I can use the code:

  Enums.BlogPostSortOption sort = Enums.BlogPostSortOption.ByDate;
  return GetBlogPostSetForWeek(startDate).SortBy(sort).GetPage(pageIndex, pageSize);

To sort those same results by BlogName I can use the code:

  Enums.BlogPostSortOption sort = Enums.BlogPostSortOption.BySite;
  return GetBlogPostSetForWeek(startDate).SortBy(sort).GetPage(pageIndex, pageSize);

If I want to get BlogPosts by category instead of by week, I just write a new filter method named GetBlogPostSetForCategory and it plugs right in:

  return GetBlogPostSetForCategory(category).SortBy(sort).GetPage(pageIndex, pageSize);
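GetBlogPostSetForCategory isn’t shown in the post either.  Here’s a sketch under the assumption that BlogPost has a Categories association with a Name property (the post doesn’t show that mapping):

```csharp
// GetBlogPostSetForCategory -- hypothetical filter method; the shape of the
// Categories association (and its Name property) is assumed, not shown in the post.
private ObjectQuery<BlogPost> GetBlogPostSetForCategory(string categoryName)
{
    var query = from p in Context.BlogPostSet.Include("Categories")
                where p.Categories.Any(c => c.Name == categoryName)
                select p;
    return (ObjectQuery<BlogPost>)query;
}
```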

Conclusion

So that's it.  This technique has significantly reduced the amount of data access code in my Repository classes and the time that it takes to write it.  I also like the fact that I’m not writing the same paging and sorting code over and over in different queries.  If you see any advantages or disadvantages to the technique, please leave a comment and let me know what you think.  Also, if you’re aware of anyone else using a similar method, please send me a link at rlacovara@gmail.com, I would like to check it out.

Saturday, July 11, 2009

Chaining the C# Null Coalescing Operator

This is something that came up in the comments last week.  I was refactoring some code and wound up with the accessor method below, which performs multiple null checks while trying to assign a value to _currentUser.

// CurrentUser
private WebUser _currentUser;
public WebUser CurrentUser
{
    get
    {
        if (_currentUser == null) _currentUser = GetWebUserFromSession();
        if (_currentUser == null) _currentUser = GetWebUserFromTrackingCookie();
        if (_currentUser == null) _currentUser = CreateNewWebUser();
        return _currentUser;
    }
}

Now this code is pretty clear, but a reader named Brian pointed out in a comment that we could shorten it a bit by using the null coalescing operator (??).  If you haven’t used the null coalescing operator yet, check it out.  It’s great for data access code where a method may return data or a null value.  The basic syntax is illustrated in this block of code from MSDN:

   // x is a nullable int with no value.
   int? x = null;

   // y = x, unless x is null, in which case y = -1.
   int y = x ?? -1;

   // Assign i to return value of method, unless
   // return value is null, in which case assign
   // default value of int to i.
   int i = GetNullableInt() ?? default(int);

So, this is really cool, but there’s more. What Brian pointed out was that you can chain the ?? operator to do multiple null checks.  So, my block of code above can be rewritten like this:

// CurrentUser
private WebUser _currentUser;
public WebUser CurrentUser
{
    get
    {
        _currentUser = _currentUser ??
                       GetWebUserFromSession() ??
                       GetWebUserFromTrackingCookie() ??
                       CreateNewWebUser();
        return _currentUser;
    }
}

Note that normally the _currentUser assignment would fit all on one line but due to the width limitation of my blog I broke it up into multiple lines.  So, C# Null Coalescing operator chaining. I like it. I’ll be adding it to my toolbox.
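To see the short-circuiting in isolation, here’s a self-contained sketch (the method names are made up to mirror the property above): each operand runs only if everything to its left came back null.

```csharp
using System;

internal static class CoalesceDemo
{
    private static string FromSession() { return null; }   // simulate a miss
    private static string FromCookie()  { return null; }   // simulate a miss
    private static string NewUser()     { return "guest"; }

    private static void Main()
    {
        // NewUser() runs only because both lookups returned null.
        string user = FromSession() ?? FromCookie() ?? NewUser();
        Console.WriteLine(user); // guest
    }
}
```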

 

Addendum:

Hey, you guys pretty much hated this change to the code. I got no blog comments but I received quite a few emails about this technique and the prevailing opinion seems to be that the original syntax with the multiple if blocks was clearer.  In fact, not a single person who emailed me liked the ?? method. So I took another look and I think I agree.  The original code does seem a little clearer to me and I don’t really think I was able to make my code any shorter or simpler by using ??.  So this may not have been the best example, but I do still think this is a very cool technique for the right situation. As always, I just need to make sure that I’m refactoring because the changes make the code cleaner, not just because they’re clever.

Friday, July 3, 2009

SOLID C# Code: Smaller Methods == Clean Code?

I’m a big fan of Robert Martin.  His book on Agile Patterns in C# is still one of the three most important programming books I’ve ever read.  If you listen to Uncle Bob for any amount of time, it won’t be long before you start hearing terms like SOLID Principles and Clean Code.  These are concepts that are widely known, but still a bit tough to define.  What exactly is clean code?  What does SOLID code look like?

Well, I’m not smart enough to provide a definition that’s any better than what’s already out there, but I can point at an example and say “that smells pretty SOLID”.  So, below is an example of some code that I wrote for an ASP.NET MVC application.  CurrentUser is a property that wraps an instance of my WebUser class, which represents, you guessed it, the currently logged-in user for my web app. When the CurrentUser property returns a user, there are four possible places it might have to check to find that user:

  1. There may already be a WebUser instance in the private member _currentUser,
  2. There may be a WebUser instance stored in the Session,
  3. There may be a TrackingCookie in the HTTP Request that can be used to get an existing WebUser,
  4. We may have none of the above, in which case we have to create a new WebUser.

So, with that said, let’s take a look at my first version of this code.

// CurrentUser
private WebUser _currentUser;
public WebUser CurrentUser
{
    get
    {
        // Do we already have the CurrentUser?
        if (_currentUser == null)
        {
            // Try to get the user from Session
            Object obj = HttpContext.Current.Session["__currentUser"];
            if (obj != null)
            {
                _currentUser = (WebUser)obj;
                return _currentUser;
            }

            // Try to get the user from a TrackingCookie
            SecurityHelper secHelper = new SecurityHelper();
            WebUserRepository rep = new WebUserRepository();
            if (secHelper.TrackingGuid != Guid.Empty)
            {
                _currentUser = rep.GetWebUserByTrackingGuid(secHelper.TrackingGuid);
                if (_currentUser != null) return _currentUser;
            }

            // If we still don't have a user then we need to create a new
            // WebUser with a new TrackingGuid.
            WebUserFactory factory = new WebUserFactory();
            _currentUser = factory.CreateWebUser();
        }
        return _currentUser;
    }
}
Hmmmmmm. Not too horrible, not too long and ungainly, but it definitely has some code smell (the bad kind).  First, I’m not thrilled about the multiple returns nested in conditional blocks. Also, the code is doing a number of different things that I felt needed comments to explain.  Now I love comments, and I’m a firm believer that when in doubt, comment.  But I’ve also come to realize that it is possible to design code in such a way that it can be just as clear, and just as understandable, without the need for comments. 

So how do we reach this promised land of understandability? The main technique I’ve been using is simply to take big blocks of code that do several different things, break them up into separate, clearly named methods, and then call those methods from the main block. Here’s how I applied this technique to my CurrentUser property.

 

 

// CurrentUser
private WebUser _currentUser;
public WebUser CurrentUser
{
    get
    {
        if (_currentUser == null) _currentUser = GetWebUserFromSession();
        if (_currentUser == null) _currentUser = GetWebUserFromTrackingCookie();
        if (_currentUser == null) _currentUser = CreateNewWebUser();
        return _currentUser;
    }
}

// GetWebUserFromSession
private WebUser GetWebUserFromSession()
{
    Object obj = HttpContext.Current.Session["__currentUser"];
    return obj == null ? null : (WebUser)obj;
}

// GetWebUserFromTrackingCookie
private WebUser GetWebUserFromTrackingCookie()
{
    SecurityHelper secHelper = new SecurityHelper();
    WebUserRepository rep = new WebUserRepository();
    if (secHelper.TrackingGuid == Guid.Empty)
        return null;
    else
        return rep.GetWebUserByTrackingGuid(secHelper.TrackingGuid);
}

// CreateNewWebUser
private WebUser CreateNewWebUser()
{
    WebUserFactory factory = new WebUserFactory();
    return factory.CreateWebUser();
}
Now I realize that in this second version I wrote more code than the first, and I do a couple of extra null checks in my main property code, but look at how understandable everything is.  I mean really. There’s not a single comment, but can you imagine anybody not understanding exactly what’s going on in this block?

   get
   {
     if (_currentUser == null) _currentUser = GetWebUserFromSession();
     if (_currentUser == null) _currentUser = GetWebUserFromTrackingCookie();
     if (_currentUser == null) _currentUser = CreateNewWebUser();
     return _currentUser;
   }

Plus, because I factored out methods like GetWebUserFromSession() and GetWebUserFromTrackingCookie(), I can now use those methods in other parts of my class without having to rewrite the functionality.  So overall, I think this version smells much more SOLID.  What do you think?  If anyone has ideas or favorite techniques for getting more SOLID, please leave a comment.