Overview of .NET expression trees
An expanded version of this page appeared in the September 2019 volume of MSDN Magazine, and can currently be seen at the Microsoft documentation site.
An expression is a sequence of one or more operands and zero or more operators that can be evaluated to a single value, object, method, or namespace. (C# Programming Guide)
.NET expression trees are objects that represent various expressions (or statements), and that you can use in code. For example, consider the following expression:
x + 17
This can be broken down into parts:
- addition
  - of the value of parameter/variable x
  - and the constant integer 17
You can construct a data structure with the same information using the types in the System.Linq.Expressions namespace:
- BinaryExpression instance, with a NodeType of Add
  - whose Left property points to a ParameterExpression instance, with a Name of "x"
  - and whose Right property points to a ConstantExpression instance, with a Value of 17
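As a minimal sketch of the structure just described, the same three nodes can be created with the factory methods on System.Linq.Expressions.Expression and then inspected. The variable names (x, seventeen, sum) are illustrative, and the snippet assumes C# 9 or later for top-level statements.

```csharp
using System;
using System.Linq.Expressions;

// Build the tree for `x + 17` by hand.
ParameterExpression x = Expression.Parameter(typeof(int), "x");
ConstantExpression seventeen = Expression.Constant(17);
BinaryExpression sum = Expression.Add(x, seventeen);

// Inspect the nodes described in the list above.
Console.WriteLine(sum.NodeType);                          // Add
Console.WriteLine(((ParameterExpression)sum.Left).Name);  // x
Console.WriteLine(((ConstantExpression)sum.Right).Value); // 17
Console.WriteLine(sum);                                   // (x + 17)
```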
Expressions in code can be made up of other expressions:
(x + 17) * 5
|------| |-| inner expressions
|------------| outer expression
resulting in a tree of expressions, or an expression tree.
The data structure we're describing can also be composed of other such data structures:
- BinaryExpression instance, with a NodeType of Multiply
  - whose Right property points to a ConstantExpression instance with a Value of 5
  - and whose Left property contains a BinaryExpression instance with a NodeType of Add
    - whose Left property points to a ParameterExpression instance, with a Name of "x"
    - and whose Right property points to a ConstantExpression instance, with a Value of 17
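The nesting can be sketched in code as well: the inner Add node is built first and then passed as the left operand of the outer Multiply node. The names inner and outer are only for illustration.

```csharp
using System;
using System.Linq.Expressions;

ParameterExpression x = Expression.Parameter(typeof(int), "x");

// Inner expression: x + 17
BinaryExpression inner = Expression.Add(x, Expression.Constant(17));

// Outer expression: (x + 17) * 5, with the inner node as its left operand
BinaryExpression outer = Expression.Multiply(inner, Expression.Constant(5));

Console.WriteLine(outer.NodeType);                     // Multiply
Console.WriteLine(outer.Left.NodeType);                // Add
Console.WriteLine(outer.Right.NodeType);               // Constant
Console.WriteLine(ReferenceEquals(outer.Left, inner)); // True
```

Each node only holds references to its children, which is why a tree of any depth can be assembled from the same small set of node types.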
You can generate expression trees in two ways. The simple way is to have the compiler generate them for you, by using expression lambdas anywhere an Expression<TDelegate> is expected. For example:
- assigning to a variable of type Expression<TDelegate>:
  Expression<Func<int, string>> expr = i => i.ToString();
- passing in as an argument to a method which expects Expression<TDelegate>.
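The second case might look like the following sketch. Describe is a hypothetical helper, not part of any library; because its parameter is typed as Expression<Func<int, string>>, the compiler hands it a tree instead of a compiled delegate.

```csharp
using System;
using System.Linq.Expressions;

// Hypothetical helper: receives the lambda as an expression tree, not a delegate.
static string Describe(Expression<Func<int, string>> expr) =>
    $"{expr.Parameters[0].Name} => {expr.Body} [{expr.Body.NodeType}]";

// The lambda syntax at the call site is identical to what a delegate parameter would accept.
Console.WriteLine(Describe(i => i.ToString())); // i => i.ToString() [Call]
```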
This method of generating expression trees has a few limitations:
- Only instances of LambdaExpression can be created (although the LambdaExpression can wrap other expressions).
- The lambda expression syntax (both expression lambdas and statement lambdas) has a number of limitations.
- Expression lambdas -- lambda expressions which only contain an expression, not a block or a statement -- have a number of additional limitations:
  - no statements, only expressions (you can't put an if block inside an expression lambda)
  - no dynamic
  - no null-propagation operator
- The complete structure of the constructed expression tree -- e.g. types of subexpressions and overload resolution -- is determined at compile time.
It is also possible to generate expression trees using the factory methods on System.Linq.Expressions.Expression:
// using static System.Linq.Expressions.Expression;
ParameterExpression prm = Parameter(typeof(int), "i");
Expression expr = Call(
    prm,
    typeof(int).GetMethod("ToString", new Type[] { })
);
// The above is the equivalent of `i.ToString()`, where `i` is of type `int`
Building expression trees in this manner usually involves a fair amount of reflection (see the example above) and multiple calls to the factory methods. However, this opens the door to dynamically writing executable code at runtime, by wrapping the expression in a call to Expression.Lambda and compiling the result:
var lambdaExpression = Lambda(
    expr,
    prm
);
string iToString = lambdaExpression.Compile().DynamicInvoke(17) as string;
// == "17"
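When the delegate type is known in advance, a strongly typed variant is possible. The sketch below uses the generic Expression.Lambda<TDelegate> overload, so the compiled result can be called directly rather than through DynamicInvoke; the names typedLambda and toStringFunc are illustrative.

```csharp
using System;
using System.Linq.Expressions;
using static System.Linq.Expressions.Expression;

ParameterExpression prm = Parameter(typeof(int), "i");

// i.ToString(), using the parameterless overload of Int32.ToString
Expression body = Call(prm, typeof(int).GetMethod("ToString", Type.EmptyTypes));

// Lambda<TDelegate> returns an Expression<TDelegate>, so Compile() yields a typed delegate.
Expression<Func<int, string>> typedLambda = Lambda<Func<int, string>>(body, prm);
Func<int, string> toStringFunc = typedLambda.Compile();

Console.WriteLine(toStringFunc(17)); // 17
```

The typed form checks at construction time that the body and parameters match the delegate signature, and invoking it avoids the late-bound call that DynamicInvoke performs.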
Expression trees were originally designed to enable mapping C# or VB.NET code into a different API. The classic example of this is generating an SQL statement:
SELECT * FROM Persons WHERE Persons.LastName LIKE N'D%'
from C# code:
IQueryable<Person> personSource = ... ;
var qry = personSource.Where(x => x.LastName.StartsWith("D"));
or VB.NET code:
Dim personSource As IQueryable(Of Person) = ...
Dim qry = personSource.Where(Function(x) x.LastName.StartsWith("D"))
How does this work? Because personSource is an IQueryable<Person>, overload resolution prefers the Queryable.Where extension method, which takes an expression as its argument, over Enumerable.Where, which only takes a delegate. The compiler converts the lambda syntax to an expression tree, something like the following:
Lambda
    Parameters -- Parameter (x, of type Person)
    Body -- Call (StartsWith)
        Arguments -- Constant ("D")
        Object -- MemberAccess (LastName)
            Instance -- Parameter (x)
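The outline above can be checked against what the compiler actually produces. In this sketch, Person is a hypothetical stand-in for the entity type the text assumes, and the casts reflect the node types listed in the outline.

```csharp
using System;
using System.Linq.Expressions;

Expression<Func<Person, bool>> filter = x => x.LastName.StartsWith("D");

// Body -- Call (StartsWith)
var call = (MethodCallExpression)filter.Body;
// Object -- MemberAccess (LastName)
var member = (MemberExpression)call.Object;
// Arguments -- Constant ("D")
var argument = (ConstantExpression)call.Arguments[0];

Console.WriteLine(filter.Parameters[0].Type.Name);            // Person
Console.WriteLine(call.Method.Name);                          // StartsWith
Console.WriteLine(member.Member.Name);                        // LastName
Console.WriteLine(member.Expression == filter.Parameters[0]); // True
Console.WriteLine(argument.Value);                            // D

// Hypothetical entity type, for illustration only.
class Person { public string LastName { get; set; } = ""; }
```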
A LINQ database provider (such as Entity Framework, LINQ2SQL, NHibernate) can take such an expression tree and map the different parts into the WHERE clause of the above SQL statement:
- MemberAccess of LastName on an instance of Person becomes accessing the LastName field for a given row in the Persons table
- Call to the StartsWith method with a Constant argument is translated into the SQL LIKE operator, against a pattern that matches the beginning of a constant string -- LIKE N'D%'.
(More information about this process in Entity Framework can be found here.)
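To make the mapping concrete, here is a deliberately tiny translator. It is a sketch only and not how Entity Framework or any real provider is implemented: ToWhereClause and Person are hypothetical, the table name is guessed by appending "s" to the type name, and only the single shape parameter.Member.StartsWith(constant) is recognized.

```csharp
using System;
using System.Linq.Expressions;

// Hypothetical toy translator; real providers handle far more shapes and use SQL parameters.
static string ToWhereClause<T>(Expression<Func<T, bool>> predicate)
{
    if (predicate.Body is MethodCallExpression call
        && call.Method.Name == "StartsWith"
        && call.Object is MemberExpression member
        && member.Expression is ParameterExpression
        && call.Arguments.Count == 1
        && call.Arguments[0] is ConstantExpression constant
        && constant.Value is string prefix)
    {
        // Assumption: table name is the type name plus "s" (Person -> Persons).
        string table = typeof(T).Name + "s";
        // Doubles single quotes only; LIKE wildcards in the prefix are not escaped.
        string escaped = prefix.Replace("'", "''");
        return $"WHERE {table}.{member.Member.Name} LIKE N'{escaped}%'";
    }
    throw new NotSupportedException($"Cannot translate: {predicate.Body}");
}

Console.WriteLine(ToWhereClause<Person>(x => x.LastName.StartsWith("D")));
// WHERE Persons.LastName LIKE N'D%'

// Hypothetical entity type, for illustration only.
class Person { public string LastName { get; set; } = ""; }
```

Anything outside the recognized shape throws, which mirrors how providers reject expressions they cannot translate.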
It is thus possible to control external APIs with C# or VB.NET code itself serving as the "language" of the API, as opposed to calling methods or other public members of the API. For example:
- creating web requests
- configuring columns in a grid view
- extracting a MethodInfo or MemberInfo from actual code, instead of using reflection:
  public static MethodInfo GetMethod(Expression<Action> expr) =>
      ((MethodCallExpression)expr.Body).Method;
  // ...
  // returns the specific overload of WriteLine that takes a double
  MethodInfo writeLineDouble = GetMethod(() => Console.WriteLine(5.0));
with the added bonus that the compiler can enforce type safety on the parts of the expression tree, and the IDE can show available members using IntelliSense.
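The MemberInfo case can be sketched the same way. GetMember is a hypothetical helper, and it assumes the lambda body is a plain property or field access with no conversion wrapped around it.

```csharp
using System;
using System.Linq.Expressions;
using System.Reflection;

// Hypothetical helper: pulls the accessed member out of a lambda such as s => s.Length.
static MemberInfo GetMember<T, TResult>(Expression<Func<T, TResult>> expr) =>
    ((MemberExpression)expr.Body).Member;

MemberInfo length = GetMember((string s) => s.Length);

Console.WriteLine($"{length.DeclaringType.Name}.{length.Name}"); // String.Length
Console.WriteLine(length.MemberType);                            // Property
```

Renaming the property with a refactoring tool updates the lambda too, which a string passed to GetProperty would not get.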
In .NET 4.0, the expression tree API was extended to allow for statements, not just expressions, so that a wider range of code structures can be described by expression trees.
For example, consider the following C# code:
var hour = DateTime.Now.Hour;
string msg;
if (hour >= 6 && hour <= 18) {
    msg = "Good day";
} else {
    msg = "Good night";
}
Console.WriteLine(msg);
We can construct the same logic using the expression tree API:
// using System.Linq; // for Single()
// using static System.Linq.Expressions.Expression;
var hour = Variable(typeof(int), "hour");
var msg = Variable(typeof(string), "msg");
var block = Block(
    // specify the variables available within the block
    new [] { hour, msg },
    // hour =
    Assign(hour,
        // DateTime.Now.Hour
        MakeMemberAccess(
            MakeMemberAccess(
                null,
                typeof(DateTime).GetMember("Now").Single()
            ),
            typeof(DateTime).GetMember("Hour").Single()
        )
    ),
    // if ( ... ) { ... } else { ... }
    IfThenElse(
        // ... && ...
        AndAlso(
            // hour >= 6
            GreaterThanOrEqual(
                hour,
                Constant(6)
            ),
            // hour <= 18
            LessThanOrEqual(
                hour,
                Constant(18)
            )
        ),
        // msg = "Good day"
        Assign(msg, Constant("Good day")),
        // msg = "Good night"
        Assign(msg, Constant("Good night"))
    ),
    // Console.WriteLine(msg);
    Call(
        typeof(Console).GetMethod("WriteLine", new [] { typeof(object) }),
        msg
    )
);
and create a delegate instance out of it:
Expression<Action> expr = Lambda<Action>(block);
Action action = expr.Compile();
which we can then invoke, like any other delegate instance:
action.Invoke();
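Because the block above reads DateTime.Now, its output depends on when it runs. As a variation, and only as a sketch, the hour can be made a lambda parameter and the message returned instead of printed. This relies on a BlockExpression taking the value of its last expression; the name greet is illustrative.

```csharp
using System;
using System.Linq.Expressions;
using static System.Linq.Expressions.Expression;

// hour is now a parameter of the lambda rather than a local variable of the block
ParameterExpression hour = Parameter(typeof(int), "hour");
ParameterExpression msg = Variable(typeof(string), "msg");

BlockExpression body = Block(
    // msg is the only variable local to the block
    new[] { msg },
    IfThenElse(
        AndAlso(
            GreaterThanOrEqual(hour, Constant(6)),
            LessThanOrEqual(hour, Constant(18))
        ),
        Assign(msg, Constant("Good day")),
        Assign(msg, Constant("Good night"))
    ),
    // the last expression supplies the value (and type) of the whole block
    msg
);

Func<int, string> greet = Lambda<Func<int, string>>(body, hour).Compile();

Console.WriteLine(greet(9));  // Good day
Console.WriteLine(greet(23)); // Good night
```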