Deep dive: Linked entities in FetchXML
The first section explains traditional FetchXML link types, i.e., inner and outer joins. It introduces the concept of reverse lookups (as those representing 1:N relations), emphasizing their crucial role in fetch performance. Additionally, it explores how the system's security model affects fetch performance and addresses specifics of the synchronization sync filters.
The second section explores the applicability of In/Exists links:
- These links can replace inner links when the linked entity’s attributes are neither referenced outside the <link-entity> nor included in the fetch results.
- They are effective when the performance of an SQL sub-query surpasses that of a join.
- This is particularly relevant for highly selective fetches (those examining large datasets to yield a small number of results) and their inner links based on reverse lookups.
The key takeaway is that while In/Exists links may not generally improve fetch performance (and can sometimes degrade it), there are specific contexts — particularly in complex Field Service queries — where they can offer significant performance improvements and help avoid potential server timeouts.
A typical example is msdyn_workorder<-bookableresourcebooking link in the MyBookings-centric Field Service client data model.
Linked entities in FetchXML
Links allow you to join related entities (tables) in a single query. For example, you might want to get information for an account which includes contacts related to that account. (We'll discuss this example later.)
Formally, links combine two entities - BaseEntity with LinkedEntity, whereby the latter is described by XML syntax <link-entity name="LinkedEntity" from="fieldFrom" to="fieldTo" alias="..." link-type="...">
In the example
- LinkedEntity.fieldFrom should match BaseEntity.fieldTo.
This typically means a lookup BaseEntity.fieldTo pointing to LinkedEntity primary key (fieldFrom), but there are more possibilities. (See below.) - Alias defines an (optional) new name for the linked entity.
- Link-type specifies various ways of joins - traditionally inner (default) and outer.
Besides this, the linked entity may contain attributes and conditions just like the base entity.
Link examples
This fetch outputs a list of contacts (represented by their full name):
<fetch>
<entity name="contact">
<attribute name="fullname"/>
</entity>
</fetch>
If we want to add information about the parent account, we can make use of the lookup contact.parentcustomerid -> account.
For example, the following fetch returns two columns from two different entities: (contact.fullname, account.name).
<fetch>
<entity name="contact">
<attribute name="fullname"/>
<link-entity name='account' from='accountid' to='parentcustomerid' link-type='inner'>
<attribute name="name"/>
</link-entity>
</entity>
</fetch>
The link is inner, which means that standalone contacts (those without a parent account) are not returned. If we also want those contacts, we can use the outer link:
<fetch>
<entity name="contact">
<attribute name="fullname"/>
<link-entity name='account' from='accountid' to='parentcustomerid' link-type='outer'>
<attribute name="name"/>
</link-entity>
</entity>
</fetch>
This time, we get:
- previous results (contact.fullname, account.name)
- new results (contact.fullname, null) representing standalone (disconnected, unassociated) contacts.
The above cases are lookup-based, i.e., they use existing relation contact->account. (Relational theory uses the term N:1 relation contact:account as 1 account has N contacts.)
However, we may pose a reverse problem: Get accounts + their contacts. We have to use the same relation as above, but in reverse order: account:1 <- contact:N. We refer to this situation as a reverse lookup. We'll subsequently discuss the two possible fetches - inner and outer.
<fetch>
<entity name="account">
<attribute name="name"/>
<link-entity name='contact' from='parentcustomerid' to='accountid' link-type='inner'>
<attribute name="fullname"/>
</link-entity>
</entity>
</fetch>
The results look like this:
(account1.name, contact11.fullname)
(account1.name, contact12.fullname)
(account1.name, contact13.fullname)
...
(account2.name, contact21.fullname)
(account2.name, contact22.fullname)
...
We see that one account is returned N times - once per each account's contact. Note that accounts not having any contact at all are not returned. If we want those as well, we have to use the outer link:
<fetch>
<entity name="account">
<attribute name="name"/>
<link-entity name='contact' from='parentcustomerid' to='accountid' link-type='outer'>
<attribute name="name"/>
</link-entity>
</entity>
</fetch>
Then we also get "orphaned" accounts (accounts with no contacts):
(account3.name, null)
Let's show another example demonstrating a bit unorthodox use of outer link ("accounts without contacts"):
<fetch>
<entity name='account'>
<attribute name='name' />
<link-entity name='contact' from='parentcustomerid' to='accountid' link-type='outer' alias='C'/>
<filter type='and'>
<condition entityname='C' attribute='parentcustomerid' operator='null'/>
</filter>
</entity>
</fetch>
Links over non-lookup fields
Tables are typically linked over lookup fields (i.e., using existing relations), but this is just a convention. Links over non-lookup fields (fields not taking part in any defined relationship) are possible as well, provided some "natural" conditions are met:
- Joined fields should have the same type. (Otherwise, type conversion may have a performance impact.)
- These column types cannot be used in from/to attributes: File, Image, MultiSelect Field, PartyList. (Reason: specific column implementation.)
- Some columns have poor performance: Long texts, calculated fields, and logical (external) columns.
Here is a bit artificial example:
<fetch>
<entity name='account'>
<attribute name='name' />
<attribute name='websiteurl' />
<link-entity name='contact' from='fullname' to='name' link-type='inner'>
<attribute name='jobtitle' />
</link-entity>
</entity>
</fetch>
One remark about performance: Unlike lookups (which are indexed on Dataverse), evaluating these criteria requires a table scan and might cause performance issues—at least on large tables.
Fetch results
In the case of simple fetches (not containing links), the results are base entity rows matching fetch conditions. Each row is represented by the values of requested attributes, e.g.:
<fetch>
<entity name='account'>
<attribute name='name' />
<attribute name='websiteurl' />
<filter type="and">
<condition attribute="statuscode" operator="ne" value="2"/>
</filter>
</entity>
</fetch>
The ID column (primary key) of the base entity is always returned - even if not requested.
If the fetch contains links, the fetch returns a cartesian product of
- Base entity rows (those matching the fetch)
- Linked entity rows (those matching base entity row)
Again, the linked entity columns (attributes) returned are those specified in the fetch. Notice that the linked entity's ID is not added automatically.
Reverse links
If the linked entity is connected over base entity lookup (i.e., LinkEntity.From points to the Primary Key), we have a 1:1 link, and the linked entity can be considered an extension of the base entity. Results always differ in this case (because of included base entity ID).
If, however, LinkEntity.From points to a field that is not a Primary Key, we have a 1:N link, and the results may have duplicates—depending on which attributes of the linked entity are requested. (Although they ALWAYS differ on the server due to the ID of the linked row.)
A typical example is a parent entity linking child records, such as
<fetch>
<attribute name="name/>"
<entity name="queue">
<link-entity name="queueitem" from="queueid" to="queueid"/>
</entity>
</fetch>
The above fetch would return each queue record multiple times, once for each of its associated queueitem child records. To prevent returning equal results, you may use the fetch with the DISTINCT keyword: <fetch DISTINCT='true'>.... Fetching distinct records has the following impacts:
- Decreased web transfer
- Decreased client work
- Increased (sometimes substantially) server work
Sync Filter specifics
Sync Filter represents the conditions that need to be matched for given server records to be downloaded to the client. It is expressed as a fetch. If the sync filter is undefined, then the default filter is used: "Get all records".
Sync Filter does not have attributes. Resco Mobile CRM automatically supplies all attributes of the base entity as defined in the client customization. The ID attribute assures that two base entity records always differ.
Sync Filter may have linked entities. Their attributes are never returned. Hence, links serve only for match testing (filtering).
If the sync filter contains a reverse link, the SyncEngine automatically generates a DISTINCT fetch to avoid potential duplicates. This can negatively affect performance. (Reverse link is a link based on 1:N relation, i.e., on a reverse lookup.)