Are you using map, forEach, and reduce wrong?

In JavaScript, Array.prototype.map, Array.prototype.forEach, and Array.prototype.reduce are used heavily in functional-style programming. However, I meet many developers who lack a clear mental model of when and how to use each function.

This is a problem because we use these functions not just for their behaviour but to communicate our code's intent. These functions reveal crucial details about the author's mental model of the code, the problem, and the solution.

In this article, I will attempt to couple a technical understanding of these functions to a semantic understanding in your mind. As a result, you should have a mental model of when to use these functions and why, plus what it means when you read code using them.

Array.prototype.map

Let's start with a simplified version of the official TypeScript definition for map and break it down. I have trimmed the type parameters down to the simplest and most common use case:

map<U>(
  callbackfn: (value: T) => U
): U[];

Above, we have a map function that accepts a type argument U and a callback function, and returns an array (U[]).

The callback function takes a value of type T and returns a U. So it accepts a function that will take a value and transform it into another value.

Map calls this transformation function, the callback, for every element (T) in the array. The callback function transforms T into U, which map uses to transform each element, returning a new array of U[].

Now I have lied to you slightly in the above explanation. Transforming implies a mutative operation. However, map does not and should not mutate anything. Instead, it returns an entirely new array (U[]) where every element has a 1 to 1 relationship with each element of T[]. We call the function map because it maps one type onto another. However, thinking of it as a transformation helps us think about when to use it.

Use .map to "transform" an array of one thing into an array of something else.

I will rewrite the above type definition again in a slightly different format, using intention revealing names for the type parameters. Then we will look at some real-world examples.

Array<InputType>.map<OutputType>(
  callbackfn: (value: InputType) => OutputType
): Array<OutputType>;

This is an identical type definition to the first one I showed you. I've included it because I know that seeing it like this will help some people grok it.
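To make the one-to-one relationship concrete, here is a minimal sketch with number as the InputType and string as the OutputType. The names prices and formatPrice are made up for illustration:

```typescript
// Hypothetical example data and callback, not taken from a real codebase.
const prices: number[] = [1.5, 20, 3.25];

// (value: InputType) => OutputType, here number => string
const formatPrice = (price: number): string => `$${price.toFixed(2)}`;

const labels: string[] = prices.map(formatPrice);

console.log(labels); // ["$1.50", "$20.00", "$3.25"]
console.log(prices); // [1.5, 20, 3.25], the input array is left untouched
console.log(labels.length === prices.length); // true, one output per input
```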

In the next scenario, we have an array of Person objects, and we want to map these into an array of ReactNodes:

type Person = {
  firstName: string;
  lastName: string;
  age: number;
};

type Props = { people: Person[] };

export const PeopleList: React.FC<Props> = ({people}) => (
  <ul>
    {people.map(
      (person) => (
        <li>{person.firstName} {person.lastName} | {person.age}</li>
      )
    )}
  </ul>
);

In this example, our map function transforms each Person object from the people array into a list item ReactNode. It's a simple use case and tightly coupled to the context in which we are using it -- the anonymous function we define for the map function isn't useful outside of the PeopleList component since it always returns a <li/> node.

Let's consider a use case where we might reuse our callback function in other contexts. In our next example, we are going to map an array of Person objects into an array of full names:

type Person = {
  firstName: string;
  lastName: string;
  age: number;
};

function getFullnameFromPerson(person: Person): string {
  return `${person.firstName} ${person.lastName}`;
}

function listPeople(people: Person[]): string {
  return people.map(getFullnameFromPerson).join(', ');
}

In the above example listPeople passes the getFullnameFromPerson function to map, but we could conceivably use getFullnameFromPerson in other contexts too.
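As a rough sketch of that reuse, the same callback could feed a different pipeline or be called directly on a single Person. The functions sortedFullNames and greet below are hypothetical and exist only to show the idea:

```typescript
type Person = {
  firstName: string;
  lastName: string;
  age: number;
};

function getFullnameFromPerson(person: Person): string {
  return `${person.firstName} ${person.lastName}`;
}

// Hypothetical second caller: reuses the callback in another map.
function sortedFullNames(people: Person[]): string[] {
  return people.map(getFullnameFromPerson).sort();
}

// Hypothetical third caller: no array involved at all.
function greet(person: Person): string {
  return `Hello, ${getFullnameFromPerson(person)}!`;
}

const people: Person[] = [
  { firstName: "Grace", lastName: "Hopper", age: 85 },
  { firstName: "Ada", lastName: "Lovelace", age: 36 },
];

console.log(sortedFullNames(people)); // ["Ada Lovelace", "Grace Hopper"]
console.log(greet(people[0])); // "Hello, Grace Hopper!"
```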

Let's try a more complicated example now, where we want to create a layer between our database implementation and our TypeScript implementation. There are several things map can help us with:

In Postgres, we use snake_case column names, but in TypeScript and JavaScript, we use camelCase. We'll want to map between these types
Enum types often come from another table; we'll map these too
We'll also map our 64bit integer ids from strings to BigInt

JavaScript Types:

type ArticleType = 'GENERAL' | 'REVIEW' | 'EDITORIAL';

type Article = {
  id: bigint;
  articleType: ArticleType;
  author: string;
  title: string;
  urlSlug: string;
  content: string;
  publishedAt: Date;
}

Database Types:

type ArticleRow = {
  id: string; // Bigints come back from node-postgres as a string
  article_type_id: number;
  author: string;
  title: string;
  url_slug: string;
  content: string;
  published_at: Date;
}

type ArticleTypeRow = {
  id: number; // small int
  type: string;
}

Mapping:

Now that we have our cast of types, let's consider how we might map from these rather rough database types to these very usable JavaScript types. I'll write a function that selects the ten most recent articles, and we will use a series of maps to transform the query results.

const getRecentArticles = async (): Promise<Article[]> => {
  const result = await pool.query<ArticleRow>(`
    SELECT * FROM articles ORDER BY published_at DESC LIMIT 10;
  `);
  return Promise.all(result.rows.map(getArticleFromArticleRow));
}

const getArticleFromArticleRow = async (row: ArticleRow): Promise<Article> => ({
  id: BigInt(row.id),
  articleType: await getArticleTypeById(row.article_type_id),
  author: row.author,
  title: row.title,
  urlSlug: row.url_slug,
  content: row.content,
  publishedAt: row.published_at,
});

/**
  This function memoises lookups to article type.
*/
const getArticleTypeById = (() => {
  let cache: Map<number, ArticleType> | undefined;
  return async (id: number): Promise<ArticleType> => {
    if (!cache) {
      const result = await pool.query<ArticleTypeRow>(`
        SELECT * FROM article_types
      `);
      // This gives us a Map where we can look up article types by id
      cache = result.rows.reduce(
        (map, row) => map.set(row.id, row.type as ArticleType),
        new Map<number, ArticleType>()
      );
    }
    const articleType = cache.get(id);
    if (!articleType) {
      throw new Error(`Unknown article type id: ${id}`);
    }
    return articleType;
  };
})();

The first function in the above example, getRecentArticles, simply returns the ten most recently published articles.

If you're wondering why we use Promise.all in the return statement, it is because our mapping callback function getArticleFromArticleRow returns a Promise. That means that the result of result.rows.map(getArticleFromArticleRow) is an array of promises. Still, the caller of getRecentArticles expects a single Promise containing an array of Articles.

Promise.all takes an array of promises and maps them to an array of resolved values within a single promise, thus fulfilling our contract to the caller of getRecentArticles.
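Stripped of the database, the shape of that step can be sketched as follows. The toUpperLater function is a hypothetical stand-in for getArticleFromArticleRow:

```typescript
// Hypothetical async transformation standing in for getArticleFromArticleRow.
const toUpperLater = async (value: string): Promise<string> =>
  value.toUpperCase();

// map with an async callback gives us an array of promises.
const pending: Promise<string>[] = ["a", "b", "c"].map(toUpperLater);

// Promise.all turns Promise<string>[] into Promise<string[]>.
const combined: Promise<string[]> = Promise.all(pending);

combined.then((values) => {
  console.log(values); // ["A", "B", "C"], in the same order as the input
});
```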

Why did we need getArticleFromArticleRow to be asynchronous (return a promise)? Because we look up the article type from the article type enum table in Postgres. You are correct in thinking that a Postgres query for every row in getRecentArticles is wasteful and slow. That is why getArticleTypeById actually caches the value locally.

Looking at the implementation, we use an immediately invoked function expression to create a cache value protected in the closure of the function definition for getArticleTypeById. This closure means no other part of the application can access cache, but all calls to getArticleTypeById utilise the same cache. It is effectively a functional singleton, providing a memoised lookup, so once the cache is populated we no longer have to query the database for ArticleTypes for the lifetime of the Node process.
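One caveat is worth sketching. If several callers reach the lookup before the first query has resolved, which can happen when Promise.all starts every row mapping at once, each of them sees an empty cache and issues its own query. A common guard is to cache the promise rather than the resolved value. The sketch below is an assumption about how one might harden the lookup, not the implementation above, and loadArticleTypes is a hypothetical stand-in for the Postgres query:

```typescript
type ArticleType = "GENERAL" | "REVIEW" | "EDITORIAL";

let queryCount = 0;

// Hypothetical stand-in for the article_types query.
const loadArticleTypes = async (): Promise<Map<number, ArticleType>> => {
  queryCount++;
  return new Map<number, ArticleType>([
    [1, "GENERAL"],
    [2, "REVIEW"],
    [3, "EDITORIAL"],
  ]);
};

const getArticleTypeById = (() => {
  // Cache the in-flight promise, so concurrent callers share one query.
  let cache: Promise<Map<number, ArticleType>> | undefined;
  return async (id: number): Promise<ArticleType> => {
    if (!cache) {
      cache = loadArticleTypes();
    }
    const articleType = (await cache).get(id);
    if (!articleType) {
      throw new Error(`Unknown article type id: ${id}`);
    }
    return articleType;
  };
})();

// Three lookups start before any of them has resolved.
const lookups = Promise.all([1, 2, 3].map((id) => getArticleTypeById(id)));
lookups.then((types) => {
  console.log(types); // ["GENERAL", "REVIEW", "EDITORIAL"]
  console.log(queryCount); // 1
});
```

A failed query would also stay cached in this sketch, so real code would probably want to clear cache when the promise rejects.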

As you can see, Array.prototype.map is a powerfully simple concept. You can build complex transformations with it, create asynchronous processes, build recursive solutions, and even chunk and distribute a map across multiple processes, because the callback handles each element independently of the others.
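Chunking works because the callback only ever sees one element at a time. As a small sketch, mapping each chunk separately and joining the results gives the same output as mapping the whole array. The chunk helper is hypothetical, and everything runs in a single process here; handing chunks to workers is left out:

```typescript
// Hypothetical helper: split an array into chunks of at most `size` elements.
function chunk<T>(items: T[], size: number): T[][] {
  const chunks: T[][] = [];
  for (let i = 0; i < items.length; i += size) {
    chunks.push(items.slice(i, i + size));
  }
  return chunks;
}

const square = (n: number): number => n * n;
const numbers = [1, 2, 3, 4, 5];

// Map the whole array in one go.
const whole = numbers.map(square);

// Map each chunk on its own; each inner map could run somewhere else.
const mappedChunks = chunk(numbers, 2).map((part) => part.map(square));

// Join the mapped chunks back into one array, preserving order.
const chunked = ([] as number[]).concat(...mappedChunks);

console.log(whole); // [1, 4, 9, 16, 25]
console.log(chunked); // [1, 4, 9, 16, 25]
```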

Now that we understand map, let's see why forEach is different.

Array.prototype.forEach

Let's start with a simplified TypeScript definition of forEach like we did with map:

forEach(
  callbackfn: (value: T) => void
): void;

This looks pretty similar to map, except that the callback function doesn't return anything, and neither does forEach itself. Why might we want a function that returns nothing for each element of an array? Because we want to create side effects.

Use forEach for creating side-effects beyond the local scope

Let's understand side effects quickly by thinking about their opposite -- a pure function. Pure functions always return the same value given the same inputs and change nothing outside themselves. The synchronous map functions in the previous example were pure, but the asynchronous functions accessing a database were impure because the return value could change between calls.

A function can also have a side effect when it affects part of the system beyond its input or output values. Let's move on to a concrete example.
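Before the email example, a tiny sketch may help separate the two ideas. Both functions below are hypothetical; the only difference between them is that the second one touches a variable outside its own inputs and output:

```typescript
// Pure: the same input always gives the same output, and nothing else changes.
const double = (n: number): number => n * 2;

// Impure: it also writes to state that lives outside the function.
let callCount = 0;
const doubleAndCount = (n: number): number => {
  callCount++; // side effect
  return n * 2;
};

console.log(double(21)); // 42
console.log(double(21)); // 42
console.log(doubleAndCount(21)); // 42
console.log(doubleAndCount(21)); // 42
console.log(callCount); // 2, something beyond the return value has changed
```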

If we wanted to send an email to every user in an array of users, we would use the forEach function if we do not care about the aggregate outcome of the email sending function. I'll give you an example, where our handleMeetingBooked function will email all attendees:

type Meeting = {
  name: string;
  attendees: User[];
  date: Date;
}

type User = {
  email: string;
  name: string;
}

const handleMeetingBooked = (meeting: Meeting): void => {
  const { attendees } = meeting;
  attendees.forEach((attendee) => {
    sendMeetingEmail(attendee, meeting);
  });
}

const sendMeetingEmail = (attendee: User, meeting: Meeting): void => {
  const emailBody = `Hi ${attendee.name},
    you have been invited to ${meeting.name} 
    on ${meeting.date} 
    with ${meeting.attendees.map(({name}) => name).join(', ')}`;
  sendEmail({ to: attendee.email, body: emailBody });
}

Notice how no part of handleMeetingBooked cares about what happens inside the callback function provided to attendees.forEach. We choose forEach instead of map because the behaviour is different and because it tells the readers of our code that the application shouldn't care about or depend upon the outcome of this email sending function.

Could I implement identical behaviour with map? Absolutely. Should I? Absolutely not.

Let's look at a typical forEach code smell -- mutating an external collection from within the callback provided to forEach.

// BAD CODE, DO NOT DO THIS
const getUsersNames = (users: User[]): string[] => {
  const names = [];
  users.forEach((user) => {
    names.push(user.name);
  });
  return names;
}

What's wrong with the above code? Aside from a few unnecessary lines, it lies to the reader. It tells the person reading this code that the callback function won't change anything within the function, and then it mutates one of the function's variables! It mightn't seem so bad in a small function like this, but this is a recipe for bugs in larger functions.

Code smell: Using .push inside forEach. Consider an alternative, e.g. .map or .filter

The correct implementation of the above would simply be:

const getUsersNames = (users: User[]): string[] => 
  users.map(({name}) => name);
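The same smell often shows up with a condition wrapped around the push. As a sketch, assuming a hypothetical isActive flag on User (it is not part of the type above), filter and map express the two steps separately:

```typescript
type User = {
  email: string;
  name: string;
  isActive: boolean; // hypothetical field, added for this example only
};

const getActiveUsersNames = (users: User[]): string[] =>
  users
    .filter((user) => user.isActive) // keep only the elements we want
    .map(({ name }) => name); // then transform the ones that remain

const users: User[] = [
  { email: "ada@example.com", name: "Ada", isActive: true },
  { email: "bob@example.com", name: "Bob", isActive: false },
  { email: "cy@example.com", name: "Cy", isActive: true },
];

console.log(getActiveUsersNames(users)); // ["Ada", "Cy"]
```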

Now that we have understood the difference between map and forEach, it is time to consider the role reduce plays in our code.

Array.prototype.reduce

Reduce is the array function that seems to cause the most confusion. Rather than starting with the type definition, I will offer you my mental model of what reduce is for, and then we can look at the code and see how it supports this way of thinking.

Use reduce to reduce a collection of elements to a single aggregate entity.

That is to say, I use reduce when I have many (an array) and want one (anything but an array). A typical example of this is summing the numbers in an array.

Let's take a look at a simplified version of the official type definition as we did before, this time keeping the goal of reducing down to a single aggregate entity in mind:

reduce<U>(
  callbackfn: (previousValue: U, currentValue: T) => U, 
  initialValue: U
): U;

The first thing to notice is that reduce returns a single U where map returned U[]. We can also see that we can supply an initialValue, but the type of this value must match the return type of the reduce function and that the callback function must also return this same type.

I'll write this again with intention revealing names:

Array<InputType>.reduce<OutputType>(
  callbackfn: (previousValue: OutputType, currentValue: InputType) => OutputType, 
  initialValue: OutputType
): OutputType;

We'll reconsider the summation example, equipped with our new mental model and expanded type definition fresh in our minds:

const total = [1, 2, 3, 4, 5].reduce(
  (previous, current) => previous + current,
  0
);
console.log(total); // 15

Above, we started with a collection of numbers and wound up with just one number. We reduced it down to its aggregate: the variable total with a value of 15.
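If it helps to see the aggregate being built up, here is the same summation with a log line added to the callback. The logging is only there for illustration:

```typescript
const numbers = [1, 2, 3, 4, 5];

const total = numbers.reduce((previous, current) => {
  const next = previous + current;
  // previous is the aggregate so far, current is the element being folded in.
  console.log(`${previous} + ${current} = ${next}`);
  return next;
}, 0);

// Logged, one line per element:
// 0 + 1 = 1
// 1 + 2 = 3
// 3 + 3 = 6
// 6 + 4 = 10
// 10 + 5 = 15

console.log(total); // 15
```

The initialValue of 0 is the first previous, and whatever the callback returns becomes previous for the next element.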

How else could we use this? Perhaps we want to reduce an array of users down to a count of common names:

type User = {
  firstName: string;
  lastName: string;
}

type CommonNamesAggregate = Record<string, number>;

const countCommonNames = (users: User[]): CommonNamesAggregate => {
  return users.reduce<CommonNamesAggregate>(
    (aggregate, user) => {
      const currentCount = aggregate[user.firstName];
      if (currentCount) {
        aggregate[user.firstName]++;
      } else {
        aggregate[user.firstName] = 1;
      }
      return aggregate;
    },
    {} // We create our new CommonNamesAggregate here
  );
};

In this case, our new aggregate entity is an object where each key is a first name from users, and the value is a count of the frequency of that name in the users array.
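As a usage sketch with made-up users, here is a condensed but equivalent version of the function above, together with the aggregate it produces:

```typescript
type User = {
  firstName: string;
  lastName: string;
};

type CommonNamesAggregate = Record<string, number>;

// Condensed form of the reducer above: start from 0 when the name is new.
const countCommonNames = (users: User[]): CommonNamesAggregate =>
  users.reduce<CommonNamesAggregate>((aggregate, user) => {
    aggregate[user.firstName] = (aggregate[user.firstName] ?? 0) + 1;
    return aggregate;
  }, {});

// Hypothetical sample data.
const sampleUsers: User[] = [
  { firstName: "Ada", lastName: "Lovelace" },
  { firstName: "Grace", lastName: "Hopper" },
  { firstName: "Ada", lastName: "Yonath" },
];

console.log(countCommonNames(sampleUsers)); // { Ada: 2, Grace: 1 }
```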

Let's write another flavour of the same countCommonNames solution and see if it helps us grok reduce:

const countCommonNames = (users: User[]): CommonNamesAggregate => {
  const uniqueNames = new Set(users.map(({firstName}) => firstName));
  const commonNamesAggregate = [...uniqueNames].reduce<CommonNamesAggregate>(
    (obj, name) => {
      obj[name] = 0;
      return obj;
    },
    {}
  );

  return users.reduce<CommonNamesAggregate>(
    (aggregate, user) => {
      aggregate[user.firstName]++;
      return aggregate;
    },
    commonNamesAggregate
  );
};

In this implementation, we first create the aggregate object separately and initialise its values to 0 by reducing our set of unique names, saving us from checking if the name already exists in our final reducer.

Now let's revisit forEach regarding the above code example. I'll implement the same behaviour using forEach, and then explain why it is a code smell.

// BAD CODE, DO NOT DO THIS
const countCommonNames = (users: User[]): CommonNamesAggregate => {
  const uniqueNames = new Set(users.map(({firstName}) => firstName));
  const commonNamesAggregate = [...uniqueNames].reduce<CommonNamesAggregate>(
    (obj, name) => {
      obj[name] = 0;
      return obj;
    },
    {}
  );

  users.forEach(
    (user) => commonNamesAggregate[user.firstName]++
  );

  return commonNamesAggregate;
};

This might look simpler because it has fewer lines of code, but it gives the reader less information about the intent of the code. By using forEach, we tell the reader to hold onto their hats and read carefully because we're about to do something. The issue is that the only clues about what that something is come from code that sits mostly outside, and before, the forEach callback.

Code smell: Mutating a local variable from within a forEach.

Ultimately, forEach is the array function with the least semantic meaning and the least ability to reveal intentions. As such, forEach should be used as a last resort when there isn't a more suitable function or when you explicitly want to indicate that what occurs within the callback is not relevant to the adjacent code.

I also want to talk briefly about a code smell in reduce usage: returning an array from reduce.

Code smell: Returning an array from reduce. Consider map or filter instead.

People usually do this by mistake when what they really want is to chain .filter and .map together, but they cram both steps into a single reducer instead. The damage from this anti-pattern is that we mislead readers of our code entirely: reduce promises a single aggregate, and the reader gets another array.
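A small sketch of the smell next to its alternative, using made-up numbers. Both versions keep the even numbers and multiply them by ten:

```typescript
const numbers = [1, 2, 3, 4, 5, 6];

// The smell: reduce is used, but the result is still an array.
const viaReduce = numbers.reduce<number[]>((result, n) => {
  if (n % 2 === 0) {
    result.push(n * 10);
  }
  return result;
}, []);

// The same behaviour, with each step named by the function that performs it.
const viaFilterMap = numbers
  .filter((n) => n % 2 === 0)
  .map((n) => n * 10);

console.log(viaReduce); // [20, 40, 60]
console.log(viaFilterMap); // [20, 40, 60]
```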

Conclusion

As programmers, we spend a lot of time working with collections of things, whether arrays, sets, maps, or objects. JavaScript comes with a fantastic suite of array functions that can make our days easier and our code more readable. Hopefully, this article has given you the confidence to know which array function to use and why.

TLDR

Use .map to "transform" an array of one thing into an array of something else
Code smell: Mutating a local variable from within a map.
Use forEach for creating side-effects beyond the local scope
Code smell: Using .push inside forEach. Consider an alternative, e.g. .map or .filter
Code smell: Mutating a local variable from within a forEach.
Use reduce to reduce a collection of elements to a single aggregate entity.
Code smell: Returning an array from reduce. Consider map and/or filter instead.
