Replication Surveillance

Description

Monitor replication to make sure it works properly and does not introduce data inconsistencies. Monitoring replication means checking that:

  • files are coming through to/from a site

  • files are being imported/exported

  • errors, if any, are logged to a log file

The Replicator service has a built-in alerting mechanism that sends alerts by email.

Here are the messages you can choose to receive/send for monitoring purposes:

  • Status report. If the Replicator stops or hangs, it cannot send any alerts, so the failure may go unnoticed. To cover this case, you can choose to receive a status report at a defined interval. If the status report does not arrive, there is a problem with the Replicator service or with mail delivery.

  • Warning. You can choose to send an alert if no replication files have been received within a given period. If you receive status reports but no files within that period, the problem is probably at the remote site.

  • Reports on errors. If an error occurs while executing a replication task, it is recommended to send an error report to our dedicated support team.

You can check the replication status in several ways:

  • In Replicator Manager. For more information, see Monitor in Replicator Manager.

  • In Adonis Personnel Manager. For more information, see Monitor in APM.

  • On the main site for all the satellites. For more information, see Monitor on Main Site.

  • On your mobile phone. For more information, see Android Replicator Monitoring.

Monitor in Replicator Manager

To monitor the replication status, run Adonis Replicator Manager. In the application, select the service you are connected to and navigate to Status > Adonis Replication:

In the Replication Status workspace, click Start to initiate the replication. Once the replication process is triggered, the progress can be viewed on the screen:

The progress bar advances in proportion to the amount of work completed.

Right above the progress bar, you can see the tasks being performed: Receive Files, Import, Export, Send Files. By default, all the tasks are pre-selected. To skip a task, clear the corresponding checkbox. A task is successful as long as the bullet next to it stays green. Failed operations are marked with red bullets:

To see detailed information about an error, click the Show Error Details button.

To learn how to resolve the issue, see Fix Replication Errors.

You can also schedule the replication to run regularly at specific times. To do this, select the service you are connected to and navigate to Scheduler > Adonis Replication:

In the Schedule of Adonis Replication workspace, select Use built-in scheduler to enable the schedule settings:

  • Run Replicator on. Define the weekdays on which the task runs. By default, all the weekdays are pre-selected. If you want to skip a weekday, clear the corresponding checkbox.

  • Start at, Repeat every, Until. In these options, define the time window and the interval at which the task runs.

  • Replication tasks to be executed. The following replication tasks are available: Receive Files, Import, Export, Send Files. By default, all the tasks are pre-selected. If you want to skip executing a task, clear the corresponding checkbox.

When completed, click Save to confirm the changes or Cancel to revert the changes.
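To preview what a schedule produces, the settings above can be modelled in a few lines of Python. This is only a sketch: it assumes that Start at and Until bound a daily time window, that Repeat every is the gap between runs, and that unselected weekdays are skipped entirely. The function name planned_runs is hypothetical and is not part of Replicator.

```python
from datetime import date, datetime, time, timedelta


def planned_runs(day, weekdays, start_at, repeat_every, until):
    """Return the run times expected on one calendar day.

    weekdays uses Python numbering (Monday is 0, Sunday is 6).
    The semantics of the three time options are an assumption.
    """
    if repeat_every <= timedelta(0):
        raise ValueError("repeat_every must be positive")
    if day.weekday() not in weekdays:
        return []  # the weekday checkbox is cleared
    runs = []
    current = datetime.combine(day, start_at)
    last = datetime.combine(day, until)
    while current <= last:
        runs.append(current)
        current += repeat_every
    return runs


# Example: weekdays only, every two hours between 08:00 and 12:00.
for run in planned_runs(date(2024, 1, 1), {0, 1, 2, 3, 4},
                        time(8, 0), timedelta(hours=2), time(12, 0)):
    print(run.isoformat())
```

Comparing such a list with the dates shown in the monitoring views makes it easier to spot a run that did not happen.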

For monitoring purposes, Replicator can send/receive the following messages:

  • Reports on errors. If an error occurs, it is recommended to send an error report to our dedicated support team.

  • Warning messages. Set up the warning message to be received if no CAB/RPL files have been received for a certain period.

  • Status reports. Set up the status reports to be received as a replication summary over a certain time period.

To enable sending/receiving messages, select the service you are connected to on the left-hand pane of Replicator Manager and go to Settings. When the Rel Settings workspace appears, click Run Settings Wizard to open the Replicator Settings Wizard dialogue. Navigate through the wizard pages until you reach the page where you can set up sending emails with reports and warnings:

Select Send emails with reports and warnings to enable the notification service settings and configure the settings:

  • Server Name. Enter the name of the mail server to connect to.

  • Port. Specify the port number used to communicate with the mail server.

  • The server requires authentication. Select the checkbox if the connection to the server requires authentication.

  • Use SSL. Select the option to create an encrypted transmission channel.

  • User Name. Enter a username.

  • Password. Enter a password.

  • Sender Email. Enter the sender's email address.

  • Receiver email list. Enter the email address(es) to which the data is to be sent. To do this, click the Add email button and type an email address. To remove an email address from the list, click Delete email.

  • Send reports on errors. Select the checkbox to send error reports.

  • Send Status Reports. Select the checkbox to send status reports. When selected, set the interval for sending reports.

  • Send warnings if no cab/rpl files have been received recently. Select the checkbox to send warnings. When selected, define the interval at which Replicator checks whether any files have been received.
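If no status reports arrive, it can be useful to test the same mail settings outside Replicator. The sketch below uses Python's standard smtplib with the values entered in the wizard. It assumes that Use SSL means an encrypted connection from the start (some servers expect STARTTLS instead), and the function names and message text are hypothetical.

```python
import smtplib
from email.message import EmailMessage


def build_test_message(sender, receivers):
    """Create a plain-text test message for the given addresses."""
    message = EmailMessage()
    message["From"] = sender
    message["To"] = ", ".join(receivers)
    message["Subject"] = "Replicator mail settings test"
    message.set_content("Test message sent with the Replicator mail settings.")
    return message


def send_test_message(server, port, sender, receivers,
                      use_ssl=False, user=None, password=None):
    """Send one test message using the values from the wizard page."""
    message = build_test_message(sender, receivers)
    # Assumption: "Use SSL" maps to an implicit TLS connection.
    smtp_class = smtplib.SMTP_SSL if use_ssl else smtplib.SMTP
    with smtp_class(server, port, timeout=30) as smtp:
        if user:  # "The server requires authentication" is selected
            smtp.login(user, password)
        smtp.send_message(message)
```

A call such as send_test_message("mail.example.com", 465, "replicator@example.com", ["it@example.com"], use_ssl=True, user="replicator", password="...") raises an exception if the server name, port or credentials are wrong, which narrows down whether the problem is in the mail transfer or in the Replicator service. All values in that call are placeholders.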

While installing/updating Replicator, you are prompted to install the RPL Viewer utility, which opens CAB/RPL files and displays the content of DAT files:

To monitor the DAT files in Adonis RPL Viewer, first, run the viewer:

In the application window, click Open File to display the Open dialogue, where you can select the DAT file to be monitored:

Click Open to open the selected file in RPL Viewer:

Now, select a file and click Show Details to view the detailed information on the file:

Monitor in APM

Check Communication Channels

Each site records when the most recent import/export was performed on its incoming/outgoing channels, so this data is easy to monitor.

To get started, log into APM and navigate to the Setup ribbon tab > Adonis Replicator > Views 2, the Communication Channels tab:

In the grid, there are several columns intended for monitoring purposes. They are Last File Nr, Last Export, Last Import. The data in these columns is updated only after the replication has been successfully performed.

Verify that the replication runs according to the defined schedule. To do this, compare the dates in the Communication Channels tab with the scheduled intervals of your replication setup. If the dates match the schedule, the replication is running correctly. If they do not, check the error log file.
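The comparison described above can be expressed as a simple rule: a channel is overdue when the time since its Last Import or Last Export exceeds the scheduled interval. The sketch below illustrates that rule. The function name and the grace parameter are hypothetical, and the timestamps are assumed to be copied by hand from the Communication Channels tab.

```python
from datetime import datetime, timedelta


def is_channel_overdue(last_transfer, now, interval, grace=timedelta(0)):
    """Return True if the last import/export is older than one
    scheduled interval plus an optional grace period."""
    return now - last_transfer > interval + grace


# Example: replication is scheduled every 6 hours, 30 minutes of slack.
last_import = datetime(2024, 3, 4, 2, 15)
checked_at = datetime(2024, 3, 4, 9, 0)
if is_channel_overdue(last_import, checked_at,
                      timedelta(hours=6), timedelta(minutes=30)):
    print("Last Import is overdue; check the error log file.")
```

The grace period is a design choice that avoids false alarms when a run simply takes longer than usual.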

Set Up Warnings

You can set up warnings to be displayed when an import/export fails. Import/export error log files are located in the Replicator IN/OUT folder, respectively.

To set up the warnings, in the APM application, navigate to the Setup ribbon tab > Global Options > Modules > Replicator:

Within the dialogue, proceed with the following steps:

  1. Define the path to the Replicator INI file:

    • Click the Browse button to display the Open dialogue.

    • In the dialogue, select the directory where the INI file is located and click Open.

  2. Define Surveillance Options by selecting/clearing the options:

    • Don't check the Existence of log files.

    • Don't check Last Export Time.

    • Don't check Last Import Time.

When you log in to the application, the system behaves according to these options. If the options are not selected, the system checks for any errors that occurred during export/import. If there are error log files in the IN/OUT folder, the system displays a warning:

Click Yes to confirm sending the log file to the system administrator.

Monitor on Main Site

You can monitor the replication activity of all the satellites from the main site. This is the most time-saving monitoring method.

To do this, add connections to the satellites (vessel sites) on the main site.

Before you start, make sure the main site can access all the sites. The following requirements must be met:

  • An IP address is provided, and Adonis Replicator is shared.

  • Ports are open for file transfer.

All the connection details per site are available in the ReplManager.ini file.

To add connections, follow the steps below:

  1. Run Replicator Manager.

  2. On the left-hand pane, right-click Services and Connections. This opens the menu:

  3. In the menu, select Add New Connection... The Connect to Replicator Service dialogue then appears:

  4. Within the dialogue, you can define the connection and authentication properties:

    • Protocol. Select TCP/IP to connect to another computer.

    • Server Name. Enter the IP address of the computer you want to connect to.

    • Port. Enter the port number required to connect to the server. Usually, the same port number is used for all connections within one client; it can be found in the ReplManager.ini file.

    • Username. Enter the username you defined when installing the application. The username is available in the ReplManager.ini file.

    • Password. Enter the password you defined when installing the application. The password is available in the ReplManager.ini file.

  5. When done, click Connect.
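If a connection cannot be established, it may help to confirm first that the satellite's port is reachable from the main site. The following sketch uses only Python's standard socket module. It checks TCP reachability and nothing else, so a successful result does not prove that the Replicator service itself is answering. The function name is hypothetical.

```python
import socket


def port_is_open(host, port, timeout=5.0):
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable hosts.
        return False
```

For example, port_is_open("192.0.2.10", 5000) would test a satellite at a placeholder address and port; use the IP address and port number from the ReplManager.ini file instead.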

Fix Replication Errors

If an error occurs when exporting or importing files, Replicator creates an error log file in the IN or OUT folder. The file name indicates the site causing the error. For example, if an error occurs while importing data from site 10, a file named 10_err.txt appears in the IN log folder.
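Because the error log name encodes the site number, a folder can be scanned for pending errors with a short script. The sketch below assumes that every error log follows the pattern of the single example above (site number, then _err.txt). The function name is hypothetical, and the folder path must be adapted to your installation.

```python
import re
from pathlib import Path

# Assumed pattern, derived from the example name 10_err.txt.
ERROR_LOG_NAME = re.compile(r"^(\d+)_err\.txt$", re.IGNORECASE)


def sites_with_errors(folder):
    """Return the sorted site numbers that have an error log in folder."""
    sites = []
    for entry in Path(folder).iterdir():
        match = ERROR_LOG_NAME.match(entry.name)
        if match and entry.is_file():
            sites.append(int(match.group(1)))
    return sorted(sites)
```

Calling sites_with_errors on the IN log folder returns, for instance, a list containing 10 when 10_err.txt is present, which tells you which site to report to the support team.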

We do not recommend fixing replication errors on your own. Instead, contact our dedicated support team for assistance: support@adonis.no.

The grid below lists typical errors that may occur when replicating files.



Typical errors and ways to solve them

Comments

Error:

FailedAction = "Import CabFile: 89~52~10~6579.cab - Validate cab file number"

Description = "FileNr out of sequence. Expected FileNr: "6587" and tried to import "6586""

NextAction = General Error - process stopped

Comment: common

Solution:

Contact the sending site (IT officers) and ask them to resend the file(s). Alternatively, if you have access to that site, resend the missing file(s) from the Out\Archive folder yourself: move the file(s) into the Out\Send folder and run the replication.

The missing files may also turn out to be already in the In\Archive folder of the receiving site. In that case, delete them from the In/Receive folder.

If the same files keep reappearing in the In/Receive folder, delete them from FTP.
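To decide which files to request again and which to delete, you can compare the file numbers waiting in the In/Receive folder with the expected FileNr from the error message. The sketch below assumes that the fourth ~-separated field of the file name is the file number and that all names passed in belong to one sending site. Both points are inferred from the examples in this section, not from a documented format, and the function name is hypothetical.

```python
def classify_file_numbers(file_names, expected_next):
    """Split pending files into missing and already-imported numbers.

    Returns (missing, stale): numbers absent between expected_next and
    the highest number present, and numbers below expected_next.
    """
    numbers = set()
    for name in file_names:
        fields = name.rsplit(".", 1)[0].split("~")
        # Assumption: field 4 holds the sequential file number.
        if len(fields) >= 4 and fields[3].isdigit():
            numbers.add(int(fields[3]))
    if not numbers:
        return [], []
    missing = [n for n in range(expected_next, max(numbers) + 1)
               if n not in numbers]
    stale = sorted(n for n in numbers if n < expected_next)
    return missing, stale


pending = ["89~52~10~6586.cab", "89~52~10~6589.cab", "89~52~10~6590.cab"]
missing, stale = classify_file_numbers(pending, expected_next=6587)
print("Ask the sending site to resend:", missing)
print("Probably imported already:", stale)
```

The first list corresponds to files to be resent from the Out\Archive folder of the sending site, and the second to files that may be deleted from the In/Receive folder after you have confirmed they are in In\Archive.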



Error:

FailedAction = "ImportCabFile: 196~5~43~57893.cab Task: HOST_REPL_DELETED.dat - DAT file, import row, PK: PK = 462886682 AND SiteNr = 46"

Description = "Timeout expired"

Source = Microsoft OLE DB Provider for SQL Server

ErrorNumber = 80004005

NextAction = Cab file import stopped, all cab files from site 5 will be skipped. Searching for files from other sites to continue



Solution:

Typically, the error is fixed automatically during the next replication. If it persists, the replication is probably slow because of a large amount of data. In that case, run the following script on the Host site:

     select * from repl_deleted where pk=462886682

Then run the statement returned by the query on the site(s) where the issue occurs, for example:

DELETE FROM dbo.PW001P06 WHERE PW001P06.SEQUENCENO = 460520601 AND (( PW001P06.REPL_GMTMODIFIEDDATE < '2013-05-21 16:42:21.960') OR ( PW001P06.REPL_GMTMODIFIEDDATE IS NULL))



Error:

1234 HOST_PW001P0P: [Microsoft][ODBC SQL Server Driver][SQL Server]The transaction ended in the trigger. The batch has been aborted.

Comment: Currently, this error is ignored by default.

Solution:

The message indicates that the importing site has modified the same record later than the exporting one. In this case, the changes made earlier are discarded, but the replication stops.

To resolve the issue, use the IGNORE_ERROR_NR option to specify the error code that must be ignored during replication. To keep the replication from stopping on the error above, specify IGNORE_ERROR_NR=3609 in the Replicator Settings Wizard dialogue:



Error:

FailedAction = Import File: 349~1~24~76952.cab - Check Archive DBVersion

ErrorDescription = DB version of CAB file (349) is greater (newer) then version of DB (345)

NextAction = Cab file import stopped. All cab files from site 1 will be skipped. Searching for files from other sites to continue.



Solution:

The database version of the file being imported (the first digits before ~ in the file name) does not match the database version stored in the database view repl_dbversion.

To resolve the issue, use one of the suggestions below:

  • If the file version is higher than the current database version, upgrade the importing site to the version installed on the exporting site.

  • If the file version is lower than the current database version, upgrade the exporting site correspondingly.

Comment: The version should be the same at all sites.
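The version check can be reproduced by reading the fields of the file name. The sketch below splits a name such as 349~1~24~76952.cab on the ~ character. Only the meaning of the first field (database version) is stated in this document. The second and third fields are assumed to be the sending and receiving site because that matches the error messages in this section, and the fourth and optional fifth fields are assumed to be a file number and a part number. All names in the code are hypothetical.

```python
from typing import NamedTuple, Optional


class ReplFileName(NamedTuple):
    db_version: int      # documented: digits before the first ~
    from_site: int       # assumption: sending site
    to_site: int         # assumption: receiving site
    file_nr: int         # assumption: sequential file number
    part: Optional[int]  # assumption: optional fifth field
    extension: str


def parse_repl_file_name(name):
    """Split a CAB/RPL file name into its numeric fields."""
    stem, _, extension = name.rpartition(".")
    fields = stem.split("~")
    if (extension.lower() not in ("cab", "rpl")
            or len(fields) not in (4, 5)
            or not all(field.isdigit() for field in fields)):
        raise ValueError(f"unexpected file name: {name!r}")
    numbers = [int(field) for field in fields]
    part = numbers[4] if len(numbers) == 5 else None
    return ReplFileName(numbers[0], numbers[1], numbers[2],
                        numbers[3], part, extension.lower())


def compare_versions(name, db_version):
    """Tell which side is behind, following the two cases above."""
    file_version = parse_repl_file_name(name).db_version
    if file_version > db_version:
        return "file is newer: upgrade the importing site"
    if file_version < db_version:
        return "file is older: upgrade the exporting site"
    return "versions match"


print(compare_versions("349~1~24~76952.cab", 345))
```

The value passed as db_version is the one stored in the repl_dbversion view of the importing database.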

Error:

1) 4278 HOST_PW001OLEDOCS: Rowset cannot be loaded because the stream is invalid.

2) FailedAction = "Import CabFile: 89~53~10~7531.cab - Validate Cab file signature and size" Description = "Incorrect Cab file size or it is not a valid cab file." NextAction = General Error - process stopped

3) 2711 HOST_REPL_DELETED: [Microsoft][ODBC SQL Server Driver][SQL Server]sp_cursorfetch: The cursor identifier value provided (abcdef0) is not valid.

4) 4279 HOST_REPL_DELETED: Subscript out of range



Solution:

All these errors have the same cause: the files have not been downloaded completely.

You can either wait for the next replication session, by which time the files will most probably be in place, or resend the files.
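An incomplete download can often be recognised before import by checking the file header. The sketch below relies on the standard Microsoft Cabinet layout, in which a file starts with the four bytes MSCF and stores its total size as a little-endian 32-bit value at offset 8. It assumes that Replicator .cab files are standard cabinets. Whether Replicator's own validation works this way, and whether the same layout applies to .rpl files, is not confirmed by this document. The function name is hypothetical.

```python
import os
import struct

CAB_SIGNATURE = b"MSCF"  # first four bytes of a Microsoft Cabinet file


def check_cab_file(path):
    """Return "ok", "empty", "bad signature" or "size mismatch"."""
    actual_size = os.path.getsize(path)
    if actual_size == 0:
        return "empty"
    with open(path, "rb") as handle:
        header = handle.read(12)
    if len(header) < 12 or header[:4] != CAB_SIGNATURE:
        return "bad signature"
    # Bytes 8-11 of the cabinet header hold the declared total size.
    (declared_size,) = struct.unpack_from("<I", header, 8)
    if declared_size != actual_size:
        return "size mismatch"
    return "ok"
```

A result of "size mismatch" or "empty" suggests that the transfer was cut short, in which case waiting for the next session or resending the file applies as described above.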



Error:

3182 HOST_PW001P0C: [Microsoft][ODBC SQL Server Driver][SQL Server]Transaction (Process ID 57) was deadlocked on lock resources with another process and has been chosen as the deadlock victim. Rerun the transaction.



Solution:

Delete the error log file and restart the replication services.



Error:

FailedAction = Connect to FTP server

Error Description = User cannot log in. Home directory inaccessible.

NextAction = General Error – FTP Receive stopped



Solution:

To fix this error, open the Replicator Settings and enter the correct port number. Also make sure that all other settings and credentials (server name, user name, password, etc.) are correct:



Error:

The replication ‘hangs.’



Solution:

To resolve the issue, restart the Adonis Replicator services. To do this, follow the steps below:

  1. In Replicator Manager, click Services and Connections on the left-hand pane. This displays the Replicator services that are currently running:

  2. Select the service you want to restart and click Stop Service. The service status changes from Running to Stopped:

  3. Select the service again and click Start Service. The service status changes from Stopped to Running.



Error:

FailedAction = "Import CabFile: 309~10~50~20118.cab Task: HOST_PW001P05.dat - DAT file, import row, PK: SEQUENCENO = 100849803"

Description = "The UPDATE statement conflicted with the FOREIGN KEY constraint "FK_PW001P05_DOCNO_OLE". The conflict occurred in database "adonis", table "dbo.PW001OLEDOCS", column 'DOCNO'"

Source = Microsoft OLE DB Provider for SQL Server

ErrorNumber = 80040E2F

NextAction = Cab file import stopped. All cab files from site 10 will be skipped. Searching for files from other sites to continue

Comment: rare error

Solution:

Script (for our example):

-- on site 50

alter table PW001P05 nocheck constraint FK_PW001P05_DOCNO_OLE

alter table PW001P05 check constraint FK_PW001P05_DOCNO_OLE

select scanneddocno from PW001P05 where scanneddocno not in (select DOCNO from PW001OLEDOCS)

-- result: 100027162

-- on the main site

update PW001OLEDOCS set docno = docno where DOCNO = 100027162



Error:

FailedAction = Import CabFile: 345~1~11~176994.cab HOST_PW001P0Y.dat - Import row, execute Update command, PK: PIN = 10352

ErrorDescription = Column name or number of supplied values does not match table definition

NextAction = Cab file import stopped, all cab files from site 1 will be skipped. Searching for files from other sites to continue

Comment: rare

Solution:

Script:

Add:

ALTER TABLE [dbo].[PW001PAY] ADD [REPLFIELD___id] [int] NULL

ALTER TABLE [dbo].[PW001PAY] ADD [REPLFIELD___LastImportError] [nvarchar](2048) NULL

ALTER TABLE [dbo].[PW001PAY] ADD [REPLFIELD___FirstImportAttempt] [float] NULL

ALTER TABLE [dbo].[PW001PAY] ADD [REPLFIELD___LastImportAttempt] [float] NULL

Remove:

ALTER TABLE [dbo].[PW001P0Y] DROP COLUMN [REPLFIELD___id]

ALTER TABLE [dbo].[PW001P0Y] DROP COLUMN [REPLFIELD___LastImportError]

ALTER TABLE [dbo].[PW001P0Y] DROP COLUMN [REPLFIELD___FirstImportAttempt]

ALTER TABLE [dbo].[PW001P0Y] DROP COLUMN [REPLFIELD___LastImportAttempt]



Error:

ERROR: FailedAction = Export Site to files: 345~24~1~22743.cab, 345~24~1~1.cab - Validate SubFileExtension field value

ErrorDescription = Action was interrupted by user

NextAction = General Error - task stopped

Comment: common

Solution:

Typically, the error is automatically fixed during the next replication.



Error:

FailedAction = Upload the rest part of 309~50~10~19477.cab file to FTP

ErrorDescription = 309~50~10~19477.cab.upl: Append/Restart not permitted, try again

NextAction = File is ignored, continue from next file

Comment: common

Solution:

Check the FTP server. The file may be zero bytes in size, have a "download" extension, or be in use by another process. Delete it and run the replication on the client’s server.



Error:

FailedAction = Import CabFile: 349~10~52~10836336.rpl TASK: SAT_AUDIT_PW001P01.dat - Reading primary key list of "AUDIT_PW001P01" table

ErrorDescription = [Microsoft] [ODBC SQL Server Driver]Query timeout expired

NextAction = Cab file import stopped. All cab files from site 10 will be skipped. Searching for files from other sites to continue

Comment: rare

Solution:

Restart the services on site 52.



Error:

FailedAction = Connect to FTP server

ErrorDescription = Read Timeout

NextAction = General Error - FTP Receive stopped



Solution:

Typically, the error is automatically fixed during the next replication.



Error:

FailedAction = Import CabFile: 375~1~22~3763~1.cab Task: HOST_PW001OLEDOCS.dat - Import row, execute Insert command, PK: DOCNO=50256795, SQL statement: IF NOT EXISTS (SELECT 1 FROM dbo.PW001OLEDOCS WHERE DOCNO = ?) AND NOT EXISTS (SELECT 1 FROM dbo.REPL_DELETED_LOG WHERE TABLENAME = 'PW001OLEDOCS' AND PKS = ? AND GMTMODIFIEDDATE > ?)

INSERT INTO dbo.PW001OLEDOCS (DOCNO, DOCUMENT, repl_ModifiedBySite, repl_ModifiedDate, REPL_GMTMODIFIEDDATE) VALUES (?, ?, ?, ?, ?)

ErrorDescription = Cannot insert the value NULL into column 'CREATEDBY', table 'ADONIS_PROD.dbo.PW001OLEDOCS'; column does not allow nulls. INSERT fails.

Comment: Ignored by default from ver. 2018.1.0.45

Solution:

Add the 3621 error code to the DB errors to be ignored field in the Replicator Settings Wizard dialogue:



Error:

FailedAction = Import File: 376~10~56~42779.rpl - Extracting task file

ErrorDescription = Cabinet file is corrupt [0x0004]

NextAction = Cab file import stopped, all cab files from site 10 will be skipped. Searching for files from other sites to continue

Comment: common

Solution:

1) Delete the corrupted file from the IN/Receive folder (at site 56).

2) Resend the file from site 10 (main site): copy it from OUT/Archive to the Send folder and run the replication.

3) Then run the replication on site 56.



Error:

FailedAction = Import File: 376~10~55~10777719.rpl - Validate RPL file

ErrorDescription = File 376~10~55~10777719.rpl is not valid RPL file

NextAction = Cab file import stopped. All cab files from site 10 will be skipped. Searching for files from other sites to continue

Comment: common

Solution:

If the file is pending on FTP, delete it. Then delete the file from the In/Receive folder on site 55.

After that, resend the file from site 10 to site 55 and run the replication on site 55.



Error:

FailedAction = Import CabFile: 550~1~15~129759.cab, import records from storage of "PW001C78" table

or

FailedAction = Import CabFile: 550~1~44~41106.cab Task: HOST_PW001C78.dat - Import row, execute Update command, PK: CODE='305', SQL statement: UPDATE dbo.PW001C78

Comment: common

Solution:

  1. Disable the scheduler: clear the Use built-in scheduler option and click Save.

  2. In Settings > Run Settings Wizard, clear the Stop Import on error option in this dialogue window:

  3. Move the problem file from the In/Receive folder to the In folder.

  4. Move all remaining files from the In/Receive folder to the In/temp folder.

  5. Put the problem file back into the In/Receive folder.

  6. Run the replication with only the Import option enabled.

  7. After the replication has run, check whether the In/Receive folder is empty.

  8. Put all files from the In/temp folder back into the In/Receive folder.

  9. Re-enable the Stop Import on error option in the Settings Wizard.

  10. Run the replication with all options enabled.

From here, there are two scenarios:

  • If the error persists, repeat steps 2-10.

  • If there are no more errors, turn the scheduler back on.
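The file moves in steps 3 to 5 and in step 8 can be scripted to avoid mistakes when many files are waiting. The sketch below produces the same end state as steps 3 to 5 (only the problem file stays in In/Receive, everything else is parked in In/temp) and then reverses it for step 8. It assumes the folder names Receive and temp directly under the In folder. The function names are hypothetical, and the scheduler and wizard steps still have to be done in Replicator Manager.

```python
import shutil
from pathlib import Path


def park_other_files(in_dir, problem_name):
    """Leave only the problem file in In/Receive (steps 3 to 5)."""
    receive = Path(in_dir) / "Receive"
    temp = Path(in_dir) / "temp"
    if not (receive / problem_name).is_file():
        raise FileNotFoundError(f"{problem_name} is not in {receive}")
    temp.mkdir(exist_ok=True)
    moved = []
    for entry in sorted(receive.iterdir()):
        if entry.is_file() and entry.name != problem_name:
            shutil.move(str(entry), str(temp / entry.name))
            moved.append(entry.name)
    return moved


def restore_parked_files(in_dir):
    """Move everything from In/temp back to In/Receive (step 8)."""
    receive = Path(in_dir) / "Receive"
    temp = Path(in_dir) / "temp"
    restored = []
    for entry in sorted(temp.iterdir()):
        if entry.is_file():
            shutil.move(str(entry), str(receive / entry.name))
            restored.append(entry.name)
    return restored
```

Run park_other_files before step 6 and restore_parked_files after step 7, once you have confirmed that the In/Receive folder is empty.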