Showing posts with label LINQ. Show all posts
Showing posts with label LINQ. Show all posts

Here is an interesting problem, try writing the bellow Linq query using lambda expressions.

var usersWithBigOrders =
  from usr in context.Users
  from ordr in usr.Orders
  where ordr.Total > 10
  select usr;

I ran into this interesting Linq question when writing yesterday's post. To figure out what the above query actually does I used Reflector. The code bellow is what the C# compiler will generate (when reverted back from IL to C#):

context.Users.SelectMany(
Expression.Lambda<Func<User, IEnumerable<Order>>>(
  Expression.Property(CS$0$0000 = Expression.Parameter(typeof(User), "user"),
   (MethodInfo) methodof(User.get_Orders)), new ParameterExpression[] { CS$0$0000 }),
    Expression.Lambda(
      Expression.New((ConstructorInfo) methodof(<>f__AnonymousType0<User, Order>..ctor,
       <>f__AnonymousType0<User, Order>), 
       new Expression[] { CS$0$0000 = Expression.Parameter(typeof(User), "user"), 
       CS$0$0002 = Expression.Parameter(typeof(Order), "order") }, 
       new MethodInfo[] { (MethodInfo) methodof(<>f__AnonymousType0<User, Order>.get_user, 
       <>f__AnonymousType0<User, Order>), 
       (MethodInfo) methodof(<>f__AnonymousType0<User, Order>.get_order, 
       <>f__AnonymousType0<User, Order>) }), 
       new ParameterExpression[] { CS$0$0000, CS$0$0002 }))
.Where(Expression.Lambda(Expression.GreaterThan(
  Expression.Property(Expression.Property(
    CS$0$0000 = Expression.Parameter(typeof(<>f__AnonymousType0<User, Order>),
     "<>h__TransparentIdentifier0"),
      (MethodInfo) methodof(<>f__AnonymousType0<User, Order>.get_order,
       <>f__AnonymousType0<User, Order>)), (MethodInfo) methodof(Order.get_Total)), 
       Expression.Convert(Expression.Constant(10, typeof(int)), typeof(int?))), 
       new ParameterExpression[] { CS$0$0000 }))
        .Select(Expression.Lambda(Expression.Property(
          CS$0$0000 = Expression.Parameter(typeof(<>f__AnonymousType0<User, Order>),
           "<>h__TransparentIdentifier0"), 
           (MethodInfo) methodof(<>f__AnonymousType0<User, Order>.get_user,
            <>f__AnonymousType0<User, Order>)), 
            new ParameterExpression[] { CS$0$0000 })).ToList<User>();

It is quite surprising how much code that the compiler actually generates. The reason for the amount of code above is because the LinqToSql table class implements IQueryable. When the compiler detects this interface instead of generating IL that performs the query it will generate IL that builds up an expression tree that describes the query.

So in order to understand what the above code actually describes I wrote the same Linq query but instead of querying LinqToSql I queried a normal .NET collection. The code that Reflector generates now looks like this:

List<User> users = new List<User>();
if (CS$<>9__CachedAnonymousMethodDelegate6 == null)
{
    CS$<>9__CachedAnonymousMethodDelegate6 = delegate (User usr) {
        return usr.Orders;
    };
}
if (CS$<>9__CachedAnonymousMethodDelegate7 == null)
{
    CS$<>9__CachedAnonymousMethodDelegate7 = delegate (User usr, Order ordr) {
        return new { usr = usr, ordr = ordr };
    };
}
if (CS$<>9__CachedAnonymousMethodDelegate8 == null)
{
    CS$<>9__CachedAnonymousMethodDelegate8 = delegate (<>f__AnonymousType0<User, Order> <>h__TransparentIdentifier0) {
        return <>h__TransparentIdentifier0.ordr.Total > 10;
    };
}
if (CS$<>9__CachedAnonymousMethodDelegate9 == null)
{
    CS$<>9__CachedAnonymousMethodDelegate9 = delegate (<>f__AnonymousType0<User, Order> <>h__TransparentIdentifier0) {
        return <>h__TransparentIdentifier0.usr;
    };
}
IEnumerable<User> usersWithBigOrders = users
  .SelectMany(CS$<>9__CachedAnonymousMethodDelegate6, CS$<>9__CachedAnonymousMethodDelegate7)
  .Where(CS$<>9__CachedAnonymousMethodDelegate8)
  .Select(CS$<>9__CachedAnonymousMethodDelegate9);

This is a lot more understandable, but still it took a while to figure out exactly what the above code was doing. Here are both queries, one using Linq syntax the other using the lambda syntax, they are equivalent.

var usersWithBigOrders =
  from usr in context.Users
  from ordr in usr.Orders
  where ordr.Total > 10
  select usr;

var usersWithBigOrders = context.Users
  .SelectMany(user => user.Orders,(user, order) => new {User = user, Order = order})
  .Where(anonType => anonType.Order.Total > 10)
  .Select(anonType => anonType.User);

One of the biggest strengths of O/R mappers is that many have object-oriented "query by criteria" API. I will show what I mean by that in a bit. But to explain why I think a criteria API is such a great thing I will show the alternatives first.

Lets take a common search form scenario where a user can specify a number of customer filtering options like date added, name, the customer should have an address, the customer should be linked to a specific salesman, etc. If you were forced to use a stored procedure how would you implement this query?

I have seen a couple of solutions to problems like this, here is one that is really bad:

IF @name = ''
BEGIN 
     --- the whole query for this case 
END    
ELSE IF @name <> '' AND @shouldHaveAddress IS NOT NULL 
BEGIN 
    --- the whole query for this case 
END 
ELSE IF @name <> '' AND @shouldHaveAddress IS NOT NULL AND @salesmanNr <> '' 
    --- the whole query for this case 
END

Here the whole query is duplicated for each possible parameter mutation. I have seen a stored procedure like this recently that duplicated a very long query about 10 times (with very small variations in the where and join clauses).

Another solution is to complicate the query with embedded IFs, CASE and additional OR statements. But this solution does not only make the query unintelligible but also I think has some limitations. The solution that I see many arrive at is to build the query using string concatenation.

SET @sqlSelect = 'SELECT distinct ' + @NEWLINE + ' ' + @returnColumns + @NEWLINE
SET @sqlFrom = ' FROM Customers cust' + @NEWLINE
IF @companies is NOT NULL
BEGIN
  SET @sqlFROM = @sqlFROM + ' JOIN Companies comp on comp.Id=cust.CompanyId' + @NEWLINE
  SET @sqlFROM = @sqlFROM + ' LEFT JOIN Addresses addr on addr.Id=cust.AddressId ' + @NEWLINE  
END
IF @salesmanId IS NOT NULL
BEGIN
  SET @sqlFROM = @sqlFROM + ' JOIN Salesmen sal on sal.Id=' + @salesmanId 
END
...
..
.

I actually think that building up the query like this is not all that bad, however I would do it in C# and skip the stored procedure. The reason I prefer this approach is that it there is less duplication and because it is somewhat easier to maintain. But it is far from what we want!

So how do can we handle this via the nhibernate query api?

ICriteria query = session.CreateCriteria(typeof(Employee));

if (searchOptions.FirstName != null)
{
  query.Add(Expression.Eq("FirstName", searchOptions.FirstName));
}

if (!searchOptions.LastName != null)
{
  query.Add(Expression.Eq("LastName", searchOptions.LastName));
}

if (searchOptions.PhoneNumber != null)
{
  query.CreateCriteria("PhoneNumbers")
    .Add(Expression.Like("Number", searchOptions.PhoneNumber + "%"));
}
return query.List<Employee>();

The nhibernate criteria api might be verbose for simple scenarios compared to a SQL query in a string, but it is scenarios like the above that it really shines :) But wait it can be better. If you use Ayende's great NHibernate Query Generator (NHQ) you can do this:

QueryBuilder<Employee> query = new QueryBuilder<Employee>();

if (options.FirstName != null)
{
  query &= Where.Employee.FirstName == options.FirstName;
}

if (options.LastName != null)
{
  query &= Where.Employee.LastName == options.LastName;
}

if (options.PhoneNumber != null)
{
  query &= Where.Employee.PhoneNumbers.Number.Like(options.PhoneNumber, MatchMode.Start);
}            

return Repository<Employee>.FindAll(query, OrderBy.Employee.LastName.Desc);

NHQ is a very clever code-gen util that you setup as a post-build step. It generates the Where/QueryBuilder classes from the nhibernate mapping files. The result is that you do not need any strings in your queries! More commonly you use it like this:

return Repository<User>.FindAll(Where.User.Name == name);

I think it comes very close to using LINQ. So to end this post I guess I need to show how the above query would look with LINQ, because LINQ to SQL handles the above scenario pretty well.

IQueryable<Employee> query = linqContext.Employees;

if (options.FirstName != null)
{
  query = query.Where(emp => emp.FirstName == options.FirstName);
}

if (options.LastName != null)
{
  query = employees.Where(emp => emp.LastName == options.LastName);
}

if (options.PhoneNumber != null)
{
  query = from emp in query
	  from phoneNr in emp.PhoneNumbers
	  where phoneNr.Number.StartsWith(options.PhoneNumber)
	  select emp;
}            

return employees.ToList();

The above code is nice, but I think querying across relations is handled nicer in the nhibernate critiera api especially when using NHQ. Well that is all, now you know why I think a criteria API is are one of the mayor reasons to use an O/R mapper.