Grouping

Sometimes the source document is a flattened, denormalized document, while the target document is a structured, normalized document. This occurs, for example, if the source document comes from a fixed-length or delimited file that is picked up by the file connection point. In that case, you can use the ‘group by’ functionality.

The following example shows that, without grouping, the order header is repeated in the target document just as it is in the source document. With grouping, the two lines for order SO1 are combined into a single SalesOrder document that has a single header.

This is the code of the source document:

<MyOrder>
	<OrderNr>SO1</OrderNr>
	<Status>Open</Status>
	<LineNr>1</LineNr>
	<Item>ITM11</Item>
	<Quantity>110</Quantity>
</MyOrder>
<MyOrder>
	<OrderNr>SO1</OrderNr>
	<Status>Open</Status>
	<LineNr>2</LineNr>
	<Item>ITM12</Item>
	<Quantity>120</Quantity>
</MyOrder>
<MyOrder>
	<OrderNr>SO2</OrderNr>
	<Status>Approved</Status>
	<LineNr>1</LineNr>
	<Item>ITM21</Item>
	<Quantity>210</Quantity>
</MyOrder>

This is the code of the resulting document if grouping is not used:

<SalesOrder>
	<SalesOrderHeader>
		<DocumentID>
			<ID>SO1</ID>
		</DocumentID>
		<Status>
			<Code>Open</Code>
		</Status>
	</SalesOrderHeader>
	<SalesOrderLine>
		<LineNumber>11</LineNumber>
		<Item>
			<ItemID>
				<ID>ITM11</ID>
			</ItemID>
		</Item>
		<Quantity>110</Quantity>
	</SalesOrderLine>
</SalesOrder>
<SalesOrder>
	<SalesOrderHeader>
		<DocumentID>
			<ID>SO1</ID>
		</DocumentID>
		<Status>
			<Code>Open</Code>
		</Status>
	</SalesOrderHeader>
	<SalesOrderLine>
		<LineNumber>12</LineNumber>
		<Item>
			<ItemID>
				<ID>ITM12</ID>
			</ItemID>
		</Item>
		<Quantity>120</Quantity>
	</SalesOrderLine>
</SalesOrder>
<SalesOrder>
	<SalesOrderHeader>
		<DocumentID>
			<ID>SO2</ID>
		</DocumentID>
		<Status>
			<Code>Approved</Code>
		</Status>
	</SalesOrderHeader>
	<SalesOrderLine>
		<LineNumber>21</LineNumber>
		<Item>
			<ItemID>
				<ID>ITM21</ID>
			</ItemID>
		</Item>
		<Quantity>210</Quantity>
	</SalesOrderLine>
</SalesOrder>

This is the code of the resulting document if grouping is used:

<SalesOrder>
	<SalesOrderHeader>
		<DocumentID>
			<ID>SO1</ID>
		</DocumentID>
		<Status>
			<Code>Open</Code>
		</Status>
	</SalesOrderHeader>
	<SalesOrderLine>
		<LineNumber>11</LineNumber>
		<Item>
			<ItemID>
				<ID>ITM11</ID>
			</ItemID>
		</Item>
		<Quantity>110</Quantity>
	</SalesOrderLine>
	<SalesOrderLine>
		<LineNumber>12</LineNumber>
		<Item>
			<ItemID>
				<ID>ITM12</ID>
			</ItemID>
		</Item>
		<Quantity>120</Quantity>
	</SalesOrderLine>
</SalesOrder>
<SalesOrder>
	<SalesOrderHeader>
		<DocumentID>
			<ID>SO2</ID>
		</DocumentID>
		<Status>
			<Code>Approved</Code>
		</Status>
	</SalesOrderHeader>
	<SalesOrderLine>
		<LineNumber>21</LineNumber>
		<Item>
			<ItemID>
				<ID>ITM21</ID>
			</ItemID>
		</Item>
		<Quantity>210</Quantity>
	</SalesOrderLine>
</SalesOrder>

To switch on grouping:

  1. In the source tree, right-click the element that contains the data to be grouped. In the example above, that would be the MyOrder element.
  2. Select Add group by. The element gets a visual indication that the grouping is modeled.

To switch off grouping:

  1. In the source tree, right-click the element that has the grouping indicator.
  2. Select Remove group by.

The grouping works as follows:

The child elements of the ‘group by’ element, MyOrder, that are used for the grouping are those connected to:

  • Direct children of the corresponding element in the target document, SalesOrder.
  • Children of non-repeating nodes such as SalesOrderHeader inside that target element.

In the example, these are OrderNr and Status.

If, at runtime, the original XML document has multiple MyOrder instances that have the same value for all elements used in grouping, these instances result in a single SalesOrder in the resulting XML document. For example, if the OrderNr is the same but the Status differs, the result is two SalesOrder nodes.
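The rule above can be illustrated with a minimal Python sketch. It is not the product's implementation. It assumes that the grouping key is the combination of the values of all grouping elements, and that instances with the same key are grouped regardless of their position in the document. The function name group_by_key and the third record (an SO1 line with status Approved) are hypothetical and were added only for illustration.

```python
def group_by_key(records, key_fields=("OrderNr", "Status")):
    """Group flat records by the combined values of all grouping fields."""
    groups = {}  # dicts keep insertion order, so groups appear in first-seen order
    for record in records:
        # The key is the tuple of ALL grouping values, not just OrderNr.
        key = tuple(record[field] for field in key_fields)
        groups.setdefault(key, []).append(record)
    return groups


# Hypothetical input: same OrderNr on every line, but the Status of the last one differs.
records = [
    {"OrderNr": "SO1", "Status": "Open", "LineNr": "1"},
    {"OrderNr": "SO1", "Status": "Open", "LineNr": "2"},
    {"OrderNr": "SO1", "Status": "Approved", "LineNr": "3"},
]

groups = group_by_key(records)
```

Here groups has two keys, ("SO1", "Open") and ("SO1", "Approved"), which corresponds to two SalesOrder nodes even though the OrderNr is the same.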

The child elements of the ‘group by’ element, MyOrder, that are connected to a child of a repeating node, such as SalesOrderLine, inside the corresponding target element, SalesOrder, are not used for the grouping.
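As a rough end-to-end illustration, the Python sketch below applies the grouping described in this section to the source document of the example. It is a sketch under assumptions, not the mapping engine itself: the wrapper element MyOrders is hypothetical (added so that the input is a single well-formed XML document), the function name group_orders is hypothetical, and all values are copied unchanged from source to target.

```python
import xml.etree.ElementTree as ET

# Source data from the example, wrapped in a hypothetical <MyOrders> root element.
SOURCE = """<MyOrders>
<MyOrder><OrderNr>SO1</OrderNr><Status>Open</Status><LineNr>1</LineNr><Item>ITM11</Item><Quantity>110</Quantity></MyOrder>
<MyOrder><OrderNr>SO1</OrderNr><Status>Open</Status><LineNr>2</LineNr><Item>ITM12</Item><Quantity>120</Quantity></MyOrder>
<MyOrder><OrderNr>SO2</OrderNr><Status>Approved</Status><LineNr>1</LineNr><Item>ITM21</Item><Quantity>210</Quantity></MyOrder>
</MyOrders>"""


def group_orders(source_xml):
    """Return one SalesOrder element per distinct (OrderNr, Status) combination."""
    root = ET.fromstring(source_xml)
    orders = {}  # grouping key -> SalesOrder element, in first-seen order
    for my_order in root.findall("MyOrder"):
        # Elements mapped to the non-repeating header form the grouping key.
        key = (my_order.findtext("OrderNr"), my_order.findtext("Status"))
        sales_order = orders.get(key)
        if sales_order is None:
            # First instance of this key: create the SalesOrder and its single header.
            sales_order = ET.Element("SalesOrder")
            header = ET.SubElement(sales_order, "SalesOrderHeader")
            document_id = ET.SubElement(header, "DocumentID")
            ET.SubElement(document_id, "ID").text = key[0]
            status = ET.SubElement(header, "Status")
            ET.SubElement(status, "Code").text = key[1]
            orders[key] = sales_order
        # Elements mapped to the repeating SalesOrderLine are not part of the key;
        # every MyOrder instance contributes exactly one line.
        line = ET.SubElement(sales_order, "SalesOrderLine")
        ET.SubElement(line, "LineNumber").text = my_order.findtext("LineNr")
        item = ET.SubElement(line, "Item")
        item_id = ET.SubElement(item, "ItemID")
        ET.SubElement(item_id, "ID").text = my_order.findtext("Item")
        ET.SubElement(line, "Quantity").text = my_order.findtext("Quantity")
    return list(orders.values())


sales_orders = group_orders(SOURCE)
for sales_order in sales_orders:
    print(ET.tostring(sales_order, encoding="unicode"))
```

With this input, the sketch produces two SalesOrder elements: one for SO1 with two SalesOrderLine children, and one for SO2 with a single line.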