Tech Questions #4: Should I use ToArray/ToList in LINQ queries?

Hi,

Quick word on LINQ

Today’s post focuses on LINQ, which stands for Language INtegrated Query. If you’re new to LINQ and would like to learn more, I invite you to visit this page. In a few words, LINQ is the name for a set of technologies based on the integration of query capabilities directly into the C# language.

In the past, I have had some questions about it, and I’m using my blog to share my experience on the topic in the hope that it makes you a better developer. So, without further ado, let’s dive into the tech question of the day!

#4: Should I use ToArray() / ToList() in LINQ queries?

When we use LINQ, most extension methods return an IEnumerable<T>, so they can be chained together to let you, the developer, do whatever it is that you have to do, like finding the names of the authors in the database who sold over 100 books in 2018. The power of LINQ is found in its deferred execution. What’s deferred execution? Good question! When you write a LINQ expression, such as a Where filter chained with a Select projection, the expression doesn’t go through the entire collection until told otherwise.

What happens behind the scenes is that LINQ waits until something forces the enumerator to move to the next element, for instance a foreach loop or a call such as First() or ToList(). What good does that bring, you might ask? It means that you won’t allocate memory or do any work until there’s no choice but to do so.
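To make the deferred behavior concrete, here is a minimal sketch. The class name, the method name and the "inspected" list are my own illustration (they are not from any library); the list simply records which elements the Where predicate actually looked at.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class DeferredExecutionSample
{
    // Hypothetical helper: builds a query and logs every element
    // that the Where predicate inspects into the "inspected" list.
    public static IEnumerable<int> EvenTimesTen(IEnumerable<int> source, List<int> inspected)
    {
        // Nothing runs here: Where and Select only describe the work to do.
        return source
            .Where(n => { inspected.Add(n); return n % 2 == 0; })
            .Select(n => n * 10);
    }
}
```

Calling EvenTimesTen(new[] { 1, 2, 3, 4, 5, 6 }, inspected) leaves the list empty. Asking for First() on the result then inspects only 1 and 2 before returning 20; the remaining elements are never touched.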

Back to the question: should I use ToArray/ToList in LINQ queries? The best answer that I can give you is that it depends on the context, but I frown upon their usage when it is unnecessary. Why? Because they force the whole query to run and allocate memory for the entire collection generated by your LINQ expression. That is a cost of O(n) in both time and memory on top of your expression. If I do need one of those methods, I tend to reach for ToArray over ToList when all I want is to iterate through the resulting collection, since an array makes it clear that the result has a fixed size. Performance-wise, the two are close. When the size of the sequence isn’t known in advance, both fill a buffer that grows behind the scenes as elements arrive. ToArray then hands back an array of exactly the right length, which can require one last copy, while ToList wraps its buffer in a List<T> that may keep some spare capacity. The exact cost depends on the runtime version, so measure before optimizing. Either way, with a result of X.Y GB you are paying for at least X.Y GB of memory that a lazily enumerated query would not need.
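As an illustration of one context where materializing can pay for itself, here is a small sketch (the names are hypothetical). A deferred query is re-executed every time it is enumerated, so when the results are read twice, a single ToArray call avoids running the projection twice.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class MaterializationSample
{
    // Returns how many times the projection ran when the
    // results are enumerated twice (once by Sum, once by Max).
    public static int CountProjections(bool materialize)
    {
        int projections = 0;
        IEnumerable<int> query = Enumerable.Range(1, 5)
            .Select(n => { projections++; return n * n; });

        // ToArray runs the query once and stores the results.
        IEnumerable<int> results = materialize ? query.ToArray() : query;

        int sum = results.Sum(); // first pass over the results
        int max = results.Max(); // second pass over the results
        return projections;
    }
}
```

Without materializing, the projection runs 10 times for 5 elements; with ToArray, it runs 5 times. If the results were only read once, the ToArray call would just be extra memory.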

As I mentioned earlier, it all depends on your context. Maybe, once you have your resulting collection, you’ll need a list for a specific purpose. When I’m using LINQ, most of the time I make sure that my API methods return IEnumerable<T> to let the users of my code do whatever they want with the result. If you need to create a dictionary or a set from my method, the return value lets you do that, since IEnumerable<T> is the common interface that (nearly all) collections implement.
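Here is a rough sketch of that idea. The AuthorQueries and Caller classes, and the (Name, Sold) tuple shape, are assumptions made up for the example; the point is only that the method returns IEnumerable<T> and each caller picks the collection it needs.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class AuthorQueries
{
    // Hypothetical data shape: (author name, books sold).
    // The method stays lazy and leaves the choice of collection to the caller.
    public static IEnumerable<(string Name, int Sold)> BestSellers(
        IEnumerable<(string Name, int Sold)> authors, int threshold)
    {
        return authors.Where(a => a.Sold > threshold);
    }
}

public static class Caller
{
    // One caller wants a lookup by name...
    public static Dictionary<string, int> AsDictionary(IEnumerable<(string Name, int Sold)> authors)
    {
        return AuthorQueries.BestSellers(authors, 100)
            .ToDictionary(a => a.Name, a => a.Sold);
    }

    // ...another one only needs the set of names.
    public static HashSet<string> AsNameSet(IEnumerable<(string Name, int Sold)> authors)
    {
        return new HashSet<string>(
            AuthorQueries.BestSellers(authors, 100).Select(a => a.Name));
    }
}
```

Had BestSellers returned a List<T>, both callers would have paid for a list they immediately throw away.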

Thank you for your time,

Kevin 🙂

Tech questions 1-3: Linq

Hey guys,

This is a new series I will try to maintain to the best of my capabilities. There’s this awesome blogger, who also happens to be a Microsoft MVP, called Iris Classon. After her first year of programming, she started to ask, and get answers for, what she’d call “stupid questions”. Why would she consider them stupid? Well, actually they’re not. They’re basically good questions that any developer, be it a junior or an architect (well, less likely if you’re an architect), could ask and get answers to. Her series is really good and it got me thinking that I should start my own too. This raises the question of why I would feel the need to do something like this on my blog. We all want to get better in our field of expertise. Software engineering being extremely broad, it can get a “bit” confusing sometimes. My tech questions will lead me, I hope, to better myself and to gain a better understanding not only of the .NET Framework, but of any kind of technology that interests me right now or will interest me in the near or far future. I hope those questions will be helpful for the readers and followers of the blog.

To start the series, I will begin with questions I have had in the past. Why? It’s going to help me stay on track and deliver answers to the tech questions on a regular basis. So let’s get cracking with LINQ. Why should we use LINQ in our C# solutions? That’s quite a question. I cannot simply answer “well, because it’s better that way” and be done with it.

1. What is LINQ?

The acronym LINQ stands for Language INtegrated Query. It allows .NET developers to query any IEnumerable<T> implementation to retrieve data much as you would from a SQL database. For instance, a list of int (List<int> implements IList<T>, which extends both the ICollection<T> and IEnumerable<T> interfaces) can be queried to find the average of the values it contains. To be able to use LINQ, add a using System.Linq directive to your source file (or have ReSharper tell you exactly which references are missing from your code).

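using System;
using System.Collections.Generic;
using System.Linq;
//Our very first LINQ call! 🙂
public class LinqSamples
{
    // Average() returns a double and throws on an empty sequence, hence the guard.
    public double AverageInts(List<int> list)
    {
        return list == null || list.Count == 0 ? 0 : list.Average();
    }
}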

2. What are some benefits of LINQ?

Well, there are many. You can bet that once you go LINQ, you won’t go back. One of the first benefits of LINQ is that it makes your code more declarative. This means that it can almost be read as plain English (as far as code goes).

using System;
using System.Collections.Generic;
using System.Linq;
public class LinqSamples
{
    //Worst name for a method I agree haha
    public List<string> OrderStringStartingByA_AndByLength(List<string> list)
    {
         return list.Where(str => str != null)
                    .Where(str => str.StartsWith("a"))
                    .OrderBy(strElement => strElement.Length)
                    .ToList(); // materialize, since the method returns a List<string>
    }
}

So basically, the method goes through the list and first keeps only the non-null strings. Then it filters out the strings that don’t start with the letter ‘a’ and, finally, orders the result by string length, from the shortest to the longest string.

It can also reduce the complexity and the length of code that would otherwise be written with a for or a foreach loop. Writing the equivalent of what I’ve just written using only a foreach loop would end up being something like:

using System;
using System.Collections.Generic;
public class LinqSamples
{
    //Worst name for a method I agree haha
    public List<string> OrderByA_AndLength(List<string> list)
    {
        var orderedList = new List<string>();
        foreach(var str in list)
        {
           // Skip (rather than stop at) strings that don't qualify.
           if(str == null) continue;
           if(!str.StartsWith("a")) continue;
           orderedList.Add(str);
        }
        // Sort by length, shortest first (List.Sort is not stable, unlike OrderBy).
        orderedList.Sort((x, y) => x.Length.CompareTo(y.Length));
        return orderedList;
    }
    // See here : 9 lines vs 3 lines in the other example!
}

Note that so far we’ve only been using the flavor of LINQ called LINQ to Objects. The architecture was well thought out; thanks to it, you can quite easily pick up other flavors of LINQ such as LINQ to XML or LINQ to SQL.

3. What is a method group?

In a few words, a method group is a set of overloaded methods resulting from a member lookup. This comes directly from the C# 3.0 specification, section 7.1. For example, given EventHandler handler = MyMethod; the name “MyMethod” refers to a method group. There could be multiple methods with the same name but different signatures, and the method group conversion creates a delegate calling the appropriate actual method.

Put differently, a method group is the name for a set of methods (that might be just one). The ToString method has many overloads (plus any extension methods): ToString(), ToString(string format), etc. Hence ToString by itself is a “method group”: the group consisting of all the different overloads of that method. It is a compiler term for “I know what the method name is, but I don’t know the signature”; it has no existence at runtime, where it has already been resolved to the correct overload. Also, if you are using LINQ, you can do something like myList.Select(methodGroup) instead of writing a lambda that only forwards its argument to the method.

The compiler can usually convert a method group to a (typed) delegate by using overload resolution, but not to a string etc.; that doesn’t make sense. Once you add parentheses, overload resolution kicks in again and you have unambiguously identified a method call.
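Here is a small sketch of what that looks like in practice. The Describe overloads and the class name are made up for the example; together, the two overloads form the method group “Describe”, and the compiler picks the right one from the delegate type it needs.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class MethodGroupSample
{
    // Two overloads: together they form the method group "Describe".
    public static string Describe(int value) => "int:" + value;
    public static string Describe(string value) => "string:" + value;

    // Method group conversion: the target delegate type tells the
    // compiler which overload to bind to (here, Describe(int)).
    public static readonly Func<int, string> DescribeInt = Describe;

    // A lambda that only forwards its argument...
    public static List<string> WithLambda(IEnumerable<int> values)
    {
        return values.Select(v => Describe(v)).ToList();
    }

    // ...can be replaced by the method group itself.
    public static List<string> WithMethodGroup(IEnumerable<int> values)
    {
        return values.Select(Describe).ToList();
    }
}
```

Both methods produce the same list; the second one simply lets overload resolution do the work that the lambda spelled out by hand.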

Hope you enjoyed this!
Kevin