SyncStats analyzer: Difference between revisions
No edit summary |
No edit summary |
||
| Line 1: | Line 1: | ||
{{WIP}} | {{WIP}} | ||
'''SyncStats analyzer''' is a tool that analyzes multiple sync logs and extracts 30+ different indicators to identify the most frequent problems and characterize average sync performance. | |||
The tool | {{Note|The tool provides tips and recommendations; however, they are primarily aimed at technical readers. Consider consulting with your technical support provider or Resco support before performing any configuration changes in production.|Warning}} | ||
Logs being analyzed can be taken from several data sources: | Logs being analyzed can be taken from several data sources: | ||
| Line 9: | Line 9: | ||
* Sync logs from any other file. | * Sync logs from any other file. | ||
== Prerequisites == | |||
* [[Resco Mobile CRM]] app, version 16.2 and later | |||
== Using SyncStats analyzer == | |||
# Go to the '''[[Setup]]/Settings''' screen, scroll down to the '''Advanced''' section, and select '''SyncStats Analyzer'''. | |||
# Select which log you want to analyze. | |||
# Optionally, enable '''Include csv legend''' if you want to include column description (documentation) in the generated file. | |||
# Click '''Start'''. | |||
The app creates a CSV file with synchronization statistics. There are two primary approaches for using the collected data: | |||
* Read the summary located at the bottom. | |||
* Import the CSV block into Excel and continue the analysis there. | |||
== Prepare logs for analysis == | |||
{{Note|Use this section if you want to prepare a custom input file with selected sync logs for further analysis.|Tip}} | |||
If you configured sync [[Auditing]], all sync logs are sent to the server table resco_mobileaudit. | |||
If you configured sync | |||
To extract sync logs from this table, you have to use these table columns: | To extract sync logs from this table, you have to use these table columns: | ||
* resco_deviceid: ID of the device which created the log. | |||
* resco_operation: Must be 473220004. (Otherwise, the log record will store unrelated audit information.) | |||
* resco_loggedon: Time when the audit record was created on the client. (Roughly corresponds to the instant when the sync was completed.) | |||
* resco_description: Text containing a single sync log. | |||
Hence we can query sync logs by device, time or any combination of both. | Hence we can query sync logs by device, time, or any combination of both. | ||
Here is a sample fetch | Here is a sample fetch that can be used (in this case, requesting 1 year of logs for all devices): | ||
<syntaxhighlight lang="XML"> | |||
<fetch> | |||
<entity name="resco_mobileaudit"> | |||
<attribute name="resco_description" /> | |||
<attribute name="resco_deviceid" /> | |||
<filter> | |||
<condition attribute="resco_entityname" operator="eq" value="synchronization_finish" /> | |||
<condition attribute="resco_operation" operator="eq" value="473220004" /> | |||
<condition attribute="resco_loggedon" operator="last-x-days" value="365" /> | |||
<!-- <condition attribute="resco_deviceid" operator="eq" value="W1KS11E102W" /> --> | |||
</filter> | |||
<order attribute="createdon" /> | |||
</entity> | |||
</fetch> | |||
</syntaxhighlight> | |||
If you want to get the logs for a particular device only, use resco_deviceid condition (commented out in the sample above). Device IDs are listed in [[Device Control]] Woodford section. | |||
Execute the fetch, extract the resco_description content from every row returned, and concatenate all such strings into a file. (The analyzer can even read raw fetch responses, provided that you replace XML entities < and > by < and > characters.) | |||
You may, of course, use more user-friendly methods, such as | |||
* Dynamics: Open a view of the resco_mobileaudit table, set up Advanced Find (similar to our fetch), then export the resco_description column. | |||
* Special Power App built for that purpose. | |||
== What you need to know about synchronization == | |||
Most users want to answer questions like "How are the FullSync's going? Whare are the problems?". Or a similar question for incremental sync. However, things are more complicated, and the distinction FullSync/IncSync is not always clear. | |||
Most users want to answer | |||
However, things are more complicated and the distinction FullSync/IncSync is not always clear. | |||
For example: | For example: | ||
* Sync was aborted due to an error in an early phase (connect failure, for example) before even deciding whether to perform a full or incremental sync. | |||
* Background sync was terminated because of a forced fullsync. | |||
* If you use "Background Download" then there are actually 2 sync runs: | |||
:* The first (marked as 'RestartSyncInBackground') performs upload and customization update | |||
:* The second run performs actual incremental or full sync. | |||
That's | That's why the analyzer defines 3 sync types: | ||
- FullSync | - FullSync | ||
- IncSync | - IncSync | ||
- SyncStart - This stands for "RestartSyncInBackground", but also for all cases when the sync was aborted | - SyncStart - This stands for "RestartSyncInBackground", but also for all cases when the sync was aborted early. | ||
Each sync log is presented as a set of 30+ performance counters. | Each sync log is presented as a set of 30+ performance counters. Not all counters apply to all use cases (for example, you may not use remote [[conflict resolution]]) - in such cases, the respective counter is zero or empty. | ||
Not all | |||
To interpret the counters you need some understanding of the sync process | To interpret the counters, you need some understanding of the [[Synchronization steps|sync process]] and [[Log_files#Synchronization_log|sync logs]]. | ||
Typical things you may look at first: | Typical things you may look at first: | ||
* SyncResult (column result) | |||
* Errors (column err) | |||
* Download performance (column ms/rec) | |||
* Modules (columns) taking the most time | |||
Error processing is a bit tricky. | Error processing is a bit tricky. There are thousands of errors, and in many cases, even the programmers have problems explaining what happened. The analyzer attempts to report typical errors, while all the rest is reported as "OtherError". If you need more information about the error, you have to trace the details in the log. | ||
There are thousands of errors and in many cases even the programmers have | |||
The | |||
Here are | Here are commonly reported errors: | ||
{| class="wikitable sortable" | |||
| NoLicense | LicenseError || License related errors | |||
|- | |||
| MissingAppProject || App project was not found | |||
|- | |||
| LoginFailed || User not able to log in | |||
|- | |||
| UserAbort || User aborted the sync or OS aborted the sync (...) | |||
|- | |||
| MissingPrivilege || User is not allowed to perform attempted action (such as delete som record) | |||
|- | |||
| WebTimeout || Internet timeout | |||
|- | |||
| WebError || Other web error | |||
|- | |||
| HttpErrorXXX || These errors are well documented in the Internet | |||
|- | |||
| TooManySrvRequests || Due to server throttling | |||
|- | |||
| ServerSqlTimeout || Server could not process the query within allocated time (typically 2 min) | |||
|- | |||
| SoapException || Server found a logical problem (e.g. it could not create a record referencing meanwhile deleted record) | |||
|- | |||
| DbWriteBlocked || Sync could not access the database, because UI action locked it. (Temp error) | |||
|- | |||
| DbLocked || Database can't be accessed because it is locked | |||
|- | |||
| DbError || Database error | |||
|- | |||
| {Create|Update|Delete}Error || Server failed to execute given action. Details in the log. | |||
|- | |||
| BugKeyNotFound, BugXXX... || Typically programmer bugs; report to Resco support | |||
|- | |||
| HRESULT_XXX| || Typically low level error codes reported by the server | |||
|- | |||
| OtherError || Any other error | |||
|} | |||
Some of the above errors may be prefixed by the sync module where the problem happened, typically "SharePoint|Docs|SyncShared". | |||
How is the output organized | == How is the output organized == | ||
First part is the actual csv list - a header and 1 row of data per 1 sync attempt. | First part is the actual csv list - a header and 1 row of data per 1 sync attempt. | ||
Revision as of 08:43, 25 April 2023
| Warning | Work in progress! We are in the process of updating the information on this page. Subject to change. |
SyncStats analyzer is a tool that analyzes multiple sync logs and extracts 30+ different indicators to identify the most frequent problems and characterize average sync performance.
| Warning | The tool provides tips and recommendations; however, they are primarily aimed at technical readers. Consider consulting with your technical support provider or Resco support before performing any configuration changes in production. |
Logs being analyzed can be taken from several data sources:
- Current sync log of your device
- Sync logs stored on the server (provided you configured sync auditing)
- Sync logs from any other file.
Prerequisites
- Resco Mobile CRM app, version 16.2 and later
Using SyncStats analyzer
- Go to the Setup/Settings screen, scroll down to the Advanced section, and select SyncStats Analyzer.
- Select which log you want to analyze.
- Optionally, enable Include csv legend if you want to include column description (documentation) in the generated file.
- Click Start.
The app creates a CSV file with synchronization statistics. There are two primary approaches for using the collected data:
- Read the summary located at the bottom.
- Import the CSV block into Excel and continue the analysis there.
Prepare logs for analysis
| Tip | Use this section if you want to prepare a custom input file with selected sync logs for further analysis. |
If you configured sync Auditing, all sync logs are sent to the server table resco_mobileaudit.
To extract sync logs from this table, you have to use these table columns:
- resco_deviceid: ID of the device which created the log.
- resco_operation: Must be 473220004. (Otherwise, the log record will store unrelated audit information.)
- resco_loggedon: Time when the audit record was created on the client. (Roughly corresponds to the instant when the sync was completed.)
- resco_description: Text containing a single sync log.
Hence we can query sync logs by device, time, or any combination of both.
Here is a sample fetch that can be used (in this case, requesting 1 year of logs for all devices):
<fetch>
<entity name="resco_mobileaudit">
<attribute name="resco_description" />
<attribute name="resco_deviceid" />
<filter>
<condition attribute="resco_entityname" operator="eq" value="synchronization_finish" />
<condition attribute="resco_operation" operator="eq" value="473220004" />
<condition attribute="resco_loggedon" operator="last-x-days" value="365" />
<!-- <condition attribute="resco_deviceid" operator="eq" value="W1KS11E102W" /> -->
</filter>
<order attribute="createdon" />
</entity>
</fetch>
If you want to get the logs for a particular device only, use resco_deviceid condition (commented out in the sample above). Device IDs are listed in Device Control Woodford section.
Execute the fetch, extract the resco_description content from every row returned, and concatenate all such strings into a file. (The analyzer can even read raw fetch responses, provided that you replace XML entities < and > by < and > characters.)
You may, of course, use more user-friendly methods, such as
- Dynamics: Open a view of the resco_mobileaudit table, set up Advanced Find (similar to our fetch), then export the resco_description column.
- Special Power App built for that purpose.
What you need to know about synchronization
Most users want to answer questions like "How are the FullSync's going? Whare are the problems?". Or a similar question for incremental sync. However, things are more complicated, and the distinction FullSync/IncSync is not always clear.
For example:
- Sync was aborted due to an error in an early phase (connect failure, for example) before even deciding whether to perform a full or incremental sync.
- Background sync was terminated because of a forced fullsync.
- If you use "Background Download" then there are actually 2 sync runs:
- The first (marked as 'RestartSyncInBackground') performs upload and customization update
- The second run performs actual incremental or full sync.
That's why the analyzer defines 3 sync types: - FullSync - IncSync - SyncStart - This stands for "RestartSyncInBackground", but also for all cases when the sync was aborted early.
Each sync log is presented as a set of 30+ performance counters. Not all counters apply to all use cases (for example, you may not use remote conflict resolution) - in such cases, the respective counter is zero or empty.
To interpret the counters, you need some understanding of the sync process and sync logs.
Typical things you may look at first:
- SyncResult (column result)
- Errors (column err)
- Download performance (column ms/rec)
- Modules (columns) taking the most time
Error processing is a bit tricky. There are thousands of errors, and in many cases, even the programmers have problems explaining what happened. The analyzer attempts to report typical errors, while all the rest is reported as "OtherError". If you need more information about the error, you have to trace the details in the log.
Here are commonly reported errors:
| LicenseError | License related errors |
| MissingAppProject | App project was not found |
| LoginFailed | User not able to log in |
| UserAbort | User aborted the sync or OS aborted the sync (...) |
| MissingPrivilege | User is not allowed to perform attempted action (such as delete som record) |
| WebTimeout | Internet timeout |
| WebError | Other web error |
| HttpErrorXXX | These errors are well documented in the Internet |
| TooManySrvRequests | Due to server throttling |
| ServerSqlTimeout | Server could not process the query within allocated time (typically 2 min) |
| SoapException | Server found a logical problem (e.g. it could not create a record referencing meanwhile deleted record) |
| DbWriteBlocked | Sync could not access the database, because UI action locked it. (Temp error) |
| DbLocked | Database can't be accessed because it is locked |
| DbError | Database error |
| Update|Delete}Error | Server failed to execute given action. Details in the log. |
| BugKeyNotFound, BugXXX... | Typically programmer bugs; report to Resco support |
| Typically low level error codes reported by the server | |
| OtherError | Any other error |
Some of the above errors may be prefixed by the sync module where the problem happened, typically "SharePoint|Docs|SyncShared".
How is the output organized
First part is the actual csv list - a header and 1 row of data per 1 sync attempt.
Detailed interpretation of all those indicators requires a deep understanding of the sync process. (TODO: Link to wiki) Don't panic, you can do a lot even without it if you look for: - Errors - Abnormal sync termination - Long times - Bad performance (Too high values "ms/rec", abnormal deletes)
Next part is (optional) csv legend containing brief description of all columns.
Final part is analytical - an attempt to evaluate sync performance as a whole. You will find here the most frequent errors plus the evaluation of various sync modules - all of that divided by the sync type.
How to read statistics in the analysis
Let's have this sample (The numbers are real, collected during early tests of a huge installation.):
----- Sync overview -----
FullSync: 16 syncs, out of them AbortedByUser:3, AbortedByWebError:5, AbortedByOtherFatalError:7, 1 (6%) completed
IncSync: 374 syncs, out of them AbortedByUser:17, AbortedByWebError:101, AbortedByOtherFatalError:136, 120 (32%) completed
Partial syncs (will be continued): RestartSyncInBackground * 38
Sync trials failed early (before deciding upon sync type): ConnectError * 17 EarlyWebError * 1
----- FullSync Analysis -----
TotalSyncTime: 16x -> 1312..20339781ms, avg = 5454823ms, median=850453ms
Times of selected sync modules: Connect: 5* <100ms | 11x -> 110..1672ms, avg = 332ms, median=172ms ConflictRes: 4*0ms | 6* <100ms | 6x -> 110..312ms, avg = 158ms, median=141ms CustomizationCheck: 14*0ms | 296ms, 1110ms TotalUpload: 16x -> 125..1422ms, avg = 423ms, median=328ms DownloadStdEntities: 1*0ms | 15x -> 906..20339531ms, avg = 5651585ms, median=1037234ms DownloadNNEntities: 14*0ms | 147266ms, 130094ms
Errors encountered (details have to be traced in the log): WebError * 5 ServerSqlTimeout * 4 SoapException * 2 // Typically refers to the error sent by the server OtherError * 1
1. There were 16 FullSync attempts: 15 of them failed (some reasons enumerated) and just 1 succeeded.
IncSync trials were better, but let's forget about them for the time being and let's skip to FullSync Analysis.
2. 16 FullSync attempts with the duration between 1.3 sec (looks like early abort) and whopping 20339 sec.
Average duration was 5454 sec, median duration 850 sec.
3. Time spent in the server connection: - In 5 cases it was an nearly instant action (taking less than 0.1 sec) - In the remaining case connection took between 0.11 and 1.67 sec. Or 0.33 sec on average.
4. Conflict resolution - Not attempted in 4 cases - 6x fast (< 0.1 sec) - Remaining cases too 0.11-0.31 sec
5. Customization check - Not attempted in 14 cases - The remaining 2 cases took 0.296, resp. 1.11 sec.
6. Other indicators can be interpreted similarly. What remains are the sync errors, this time not only errors causing sync abort, but all errors encountered. For example : - Server timeout happened 4 times. Btw, this error typically aborts the synchronization. - SoapException represents server error response. We don't know where the problem happened, what happened or whether this error caused sync abort. To find out this information, you have to dig into the sync logs. (Find the row where the problem happened and get the sync time, which in turn identifies the sync log.)
CSV legend
This is an optional part of the output file.
LEGEND: project ; Project name (specified in Woodford) projVer ; Project version (specified in Woodford) syncType ; SyncStart | FullSync | IncSync; SyncStart means download was not attempted device ; Device ID; As used by Woodford Device Control date ; When sync terminated result ; Sync result: ; Normal ; Ok - sync completed until end. Errors, if any, did not prevent sync continuation. ; RestartSyncInBackground ; Ok - sync will be switched to background ; CustomizationReady ; New customization ready on the server ; CustomizationDownloaded ; New customization downloaded ; Wiped ; App wiped from Woodford ; AppLocked ; App locked from Woodford ; AdminFullSync ; Woodford requested FullSync, BkSync aborted ; UploadErrors ; Some change(s) failed to be uploaded ; Aborted ; User abort - sync aborted due to user action ; WebError ; Sync abort due to web error ; ConnectFailed ; Can't connect to the server, sync not started ; OAuthError ; Hint to the user: 'Refresh OAuth token from SetupForm' ; Warning ; Sync produced warnings which should be treated as errors in UI ; OtherFatalError ; Fatal (aborting sync) error not reported via above SyncResults errs ; Count of sync errors err ; Types of errors encountered warns ; Count of sync warnings confls ; Count of sync conflicts dur,ms ; [ms] Sync duration conn,ms ; [ms] Connect duration conflRes,ms; [ms] Remote conflict resolution duration conflRes ; Remote conflict resolution details; 'Download failed', 'Resolved:12', 'Failed' - some resolution commands were not accepted cust,ms ; [ms] Customization; Total download time prep,ms ; [ms] Sync preparation phase; Without customization & remote conflict resolution upload,ms ; [ms] Total upload time sent ; # records uploaded rcv ; # records downloaded ms/rec ; Download performance [ms/record]; provided only when enough records were downloaded del's,ms ; [ms] Sync deletes processing duration; (Delete plugin) anal,ms ; [ms] Sync analyzer duration details ; Sync analyzer details; 'First:nnn Full:nnn' - # entities synced for the 1st time, # entities fullsynced due to SyncAnalyzer decision entitys,ms ; [ms] Time spent in standard entities download NN,ms ; [ms] Time spent in NN entities download imgs,ms ; [ms] Time spent in entity images download shared,ms ; [ms] Time spent in shared records download APs,ms ; [ms] Time spent in the download of activityparty records lists,ms ; [ms] Time spent in the download of list entity (marketing lists) recClean,ms; [ms] Time spent in CRM records cleanup docClean,ms; [ms] Time spent in documents cleanup att,ms ; [ms] Time spent in attachments download att ; Attachments download details; 'Async' for async download, 'Recv'= # downloaded att's, 'Failed'=# failed att's, 'Wait'=# att's waiting for next sync, 'Failed' for fatal error files,ms ; [ms] Time spent in document download preparation; Sum for SharePoint + OneDrive + GoogleDrive + Box + DropBox fQueue,ms ; [ms] Time spent in document download fQueue ; Document download details; 'Recv'=# downloaded files, 'Wait'=# files waiting for next sync, 'Quota+' if quota exceeded, 'Failed' for fatal error apiCalls ; # API calls
-->