GENERATE

The GENERATE function returns a table formed by the Cartesian product of all the rows of the first table and the rows of the second table after being evaluated in the row context of each of the rows of the first one. Or, in other words, for each of the rows in the first table, a row context is created that is applied to the second table before the join is performed.

Syntax

GENERATE(
    table1,
    table2
)

Parameters
  • table1: First table, whose rows will generate the row context.
  • table2: Second table, whose rows are going to be contextualized before applying the Cartesian product.
Returned value

The GENERATE function returns a table that is the result of performing the Cartesian product between the rows of the first table and the rows of the second table after being contextualized.

Additional Information

Tables must have columns with mismatched names, otherwise the GENERATE function will return an error.

Examples

In this first example we start from the following two tables:

Table 1Table 2

Note that these tables have a common field ("Oficina"), so creating a calculated table using the GENERATE function will return an error:

Result of applying the GENERATE function to both tables

Interestingly, it is possible to use this function to generate an intermediate table on which to apply other functions:

Number of rows in the resulting table

Since, in this exercise, we want to visualize the table resulting from applying the GENERATE function to our two data tables, we are going to generate a new table, Tabla2B , with the same fields as Tabla2 but changing the name of the "Oficina" field, something which we will do using the SELECTCOLUMNS function:

Tabla2B = 
    SELECTCOLUMNS(
        Tabla2;
        "Oficina2", Tabla2[Oficina];
        "Compras", Tabla2[Compras]
    )

Tabla2 con el nombre del campo "Oficina" modificado

Now we could apply the GENERATE function to the tables Tabla1 and Tabla2B. If we do it, the result is the same as we would have when using the CROSSJOIN function:

Result of applying the GENERATE function

In this case, applying the row context generated by each of the rows in the first table has not modified the result of the Cartesian product.

But, if instead of applying the GENERATE function to Tabla2B, we do it to the result of filtering said table considering the fields of Tabla1, the result is quite different. Let's do this filtering with the FILTER function:

Tabla = 
    GENERATE(
        Tabla1,
        FILTER(
            Tabla2B,
            Tabla1[Oficina] = Tabla2B[Oficina2]
        )
    )

Result of applying the GENERATE function considering the row context created

Let's see what happened:

The function has traversed each of the rows in the first table, creating a row context for each of them. Thus, for the first one, the row context is given by, among other things, "Oficina " = "España". Next, this row context has been applied to the second table, which is the result of applying a filter to Tabla2B according to which the value of the Oficina2 field must match the result of the Oficina field of the first table. As a result of applying this row context, the second table has been reduced to those rows in which the Oficina2 field takes the value "España" (a single record in this case). Next, the row being considered from the first table has been multiplied by each and every one of the rows from the second table after applying the context (a single row). The result of this first step is -in this example- a single row with the fields of both tables for the records in which the name of the office is "España".

After evaluating the first row of the first table, the same procedure would be carried out with each and every one of the other rows of this table, generating a row context, applying it to the second table, and multiplying the row of the first table by the resulting ones in the second table.

In this example, where all the field values for the offices are present in both tables, the result is similar to an inner join.

To test this last statement, let's repeat the operation including an office in the first table that does not exist in the second, and one in the second that does not exist in the first:

First tableSecond table

Let's now apply the GENERATE function with the filter seen:

Result of applying the GENERATE function considering the row context created

Indeed, when for a row in the first table there is not at least one other equivalent in the second, the product is an empty set (and no result is returned). And the same happens when a row from the second table never gets to be multiplied by any of the first table because it does not have an equivalent.

Of course, we are not obliged to work with a single condition. Suppose we start from the following tables:

First tableSecond table

After changing the name of the "Oficina" field in the second table, we could apply the GENERATE function as follows:

Tabla = 
    GENERATE(
        Tabla1;
        FILTER(
            Tabla2B;
            Tabla1[Oficina] = Tabla2B[Oficina2] &&
            Tabla1[Año] <= 2018
        )
    )

Result of applying the GENERATE function considering the row context created

In this case we are creating a row context that includes the name of the office of the row being considered in the first table and that the year for said row is less than or equal to 2018. If the result of applying this row context to the second table is an empty table, no row will be generated in the table resulting from applying the GENERATE function.

Category
Statistical
Submitted by admin on Mon, 07/01/2019 - 16:32