fsdb_rest_api_sync

plantdb.client.cli.fsdb_rest_api_sync

FSDB Synchronization via REST API

This module synchronizes scan archives between two FSDB (Plant Database) instances through their REST APIs. It is intended for migrating or backing up scan data from one database instance to another.

Key Features
  • Command-line Interface (CLI): Includes an argument parser for easy configuration of source and target databases.
  • URL Validation: Verifies the format and accessibility of the provided database URLs.
  • Scan Synchronization: Retrieves scan archives from the source database and uploads them to the target database.
  • REST API Integration: Uses REST API calls to download scan archives from the origin and upload them to the target.
  • Progress Tracking: Displays a progress bar with tqdm to show transfer status for each scan.
  • Logging: Offers informational and error logging to provide feedback during the synchronization process.
  • Error Handling: Handles common errors such as invalid URLs, inaccessible ports, and connectivity issues.
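
The URL validation step can be pictured with a short sketch. The module's own checker, test_host_port_availability, is not reproduced on this page, so the helper names below (split_host_port, is_port_open) and the two-second timeout are assumptions made for illustration, not the module's implementation.

```python
import socket


def split_host_port(address):
    """Split a 'host:port' string into (host, port); raise ValueError if malformed."""
    host, sep, port = address.rpartition(":")
    if not sep or not host or not port.isdigit():
        raise ValueError(f"Expected 'host:port', got '{address}'")
    port = int(port)
    if not 0 < port < 65536:
        raise ValueError(f"Port out of range in '{address}'")
    return host, port


def is_port_open(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within `timeout` seconds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

Calling split_host_port("127.0.0.1:5001") gives the pair to pass to is_port_open. Checking the format first keeps a malformed argument from being reported as a connectivity problem.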
Usage Examples

To synchronize two FSDB databases, run the script from the command line with the origin and target arguments in the format host:port. Temporary files are stored in the /tmp directory during the transfer process.

python fsdb_rest_api_sync.py 192.168.1.1:5000 192.168.1.2:5000

This command connects the origin database at 192.168.1.1:5000 to the target database at 192.168.1.2:5000 and transfers all scan archives.

To transfer only a subset of scans, pass a regular expression with the optional --filter argument. The pattern is searched anywhere in the scan name, so anchor it with ^ to match a prefix. For example, to transfer only scans with names starting with 'virtual_':

python fsdb_rest_api_sync.py 192.168.1.1:5000 192.168.1.2:5000 --filter '^virtual_'
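
Because --filter takes a regular expression rather than a shell glob, a pattern can select more names than a wildcard reading suggests. The sketch below uses only Python's re module and a made-up list of scan names. It assumes the filter is applied with re.search, as in the filter_scan source on this page.

```python
import re

# Hypothetical scan names, for illustration only
scan_names = ["virtual_plant", "virtual_plant_analyzed", "real_plant", "my_virtual_test"]


def select(pattern, names):
    """Keep the names in which `pattern` is found anywhere (re.search semantics)."""
    regex = re.compile(pattern)
    return [name for name in names if regex.search(name)]


# Anchored pattern: only names that start with 'virtual_'
prefix_only = select(r"^virtual_", scan_names)

# Glob-style pattern read as a regex: 'virtual' followed by zero or more '_',
# found anywhere in the name, so 'my_virtual_test' is selected too
glob_like = select(r"virtual_*", scan_names)

print(prefix_only)
print(glob_like)
```

The anchored pattern keeps the two names that begin with 'virtual_', while the glob-style pattern also keeps 'my_virtual_test'.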

Testing

To test this script, you can set up two test FSDB REST API instances locally, with each instance running on a different port. One instance should start with sample data, while the other should be empty. You can then use this script to transfer data from the first instance to the second.

For example:

  1. In the first terminal, run the following command:
    fsdb_rest_api --port 5001 --test
    
    This will create a database with sample data that listens on port 5001.
  2. In the second terminal, run:
    fsdb_rest_api --port 5002 --test --empty
    
    This will create an empty database that listens on port 5002.
  3. Finally, in the third terminal, run the sync script:
    fsdb_rest_api_sync 127.0.0.1:5001 127.0.0.1:5002
    
    This command transfers the data from the instance running on port 5001 to the instance running on port 5002.

filter_scan

filter_scan(scan_list, filter_pattern, logger)

Filters a list of scan identifiers based on a regular expression pattern.

Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| scan_list | list of str | A list of scan identifiers to be filtered. | required |
| filter_pattern | str | A regular expression pattern used to filter the scan identifiers. | required |
| logger | Logger | A logger instance for logging messages. | required |

Returns:

| Type | Description |
|---|---|
| list of str or None | A list containing only the scan identifiers that match the regular expression pattern, or None if the pattern is invalid. |

Source code in plantdb/client/cli/fsdb_rest_api_sync.py
def filter_scan(scan_list, filter_pattern, logger):
    """Filters a list of scan identifiers based on a regular expression pattern.

    Parameters
    ----------
    scan_list : list of str
        A list of scan identifiers to be filtered.
    filter_pattern : str
        A regular expression pattern used to filter the scan identifiers.
    logger : logging.Logger
        A logger instance for logging messages.

    Returns
    -------
    list of str or None
        A list containing only the scan identifiers that match the
        regular expression pattern, or None if the pattern is invalid.
    """
    try:
        regex = re.compile(filter_pattern)
    except re.error as e:
        # The pattern could not be compiled: report it and signal failure with None
        logger.error(f"Invalid regular expression '{filter_pattern}': {e}")
        return None
    return [scan_id for scan_id in scan_list if regex.search(scan_id)]
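
A caller has to check for the None return value before using the result. The following sketch reproduces the filtering logic in a stand-alone form (named filter_names here to mark it as a copy, not the library function) and exercises both outcomes. The logger name and the sample identifiers are made up.

```python
import logging
import re

logging.basicConfig(level=logging.INFO)
logger = logging.getLogger("filter_demo")  # hypothetical logger name


def filter_names(names, pattern, logger):
    """Stand-alone copy of the filtering logic: matching names, or None if `pattern` is invalid."""
    try:
        regex = re.compile(pattern)
    except re.error as e:
        logger.error(f"Invalid regular expression '{pattern}': {e}")
        return None
    return [name for name in names if regex.search(name)]


names = ["scan_2023_01", "scan_2024_01", "calibration"]

matched = filter_names(names, r"^scan_2024", logger)
invalid = filter_names(names, r"scan_(", logger)  # unbalanced parenthesis

if invalid is None:
    print("Pattern rejected, nothing to transfer.")
print(matched)
```

Returning None rather than an empty list lets the caller tell "no scan matched" apart from "the pattern could not be compiled".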

main

main()

Main execution function.

This function orchestrates the parsing of command-line arguments, validation of the provided database URLs, and the synchronization of scan archives between the origin and target databases.

Source code in plantdb/client/cli/fsdb_rest_api_sync.py
def main():
    """Main execution function.

    This function orchestrates the parsing of command-line arguments, validation of the
    provided database URLs, and the synchronization of scan archives between the origin
    and target databases.
    """
    # Parse command-line arguments using the parsing function
    parser = parsing()
    # Extract the parsed arguments
    args = parser.parse_args()

    # Validate the availability of origin and target database URLs
    test_host_port_availability(args.origin)
    test_host_port_availability(args.target)

    # Synchronize scan archives between the origin and target databases
    sync_scan_archives(args.origin, args.target, args.filter, args.log_level)  # Perform synchronization

parsing

parsing()

Parses command line arguments for database synchronization.

Returns:

| Type | Description |
|---|---|
| ArgumentParser | An argument parser configured for FSDB synchronization. |

Source code in plantdb/client/cli/fsdb_rest_api_sync.py
def parsing():
    """Parses command line arguments for database synchronization.

    Returns
    -------
    argparse.ArgumentParser
        An argument parser configured for FSDB synchronization.
    """
    parser = argparse.ArgumentParser(description='Transfer dataset from an origin FSDB database towards a target using REST API.')
    parser.add_argument('origin', type=str,
                        help='source database URL, as "host:port".')
    parser.add_argument('target', type=str,
                        help='target database URL, as "host:port".')

    parser.add_argument('--filter', type=str, default=None,
                        help='optional regular expression to filter scan names.')

    log_opt = parser.add_argument_group("Logging options")
    log_opt.add_argument("--log-level", dest="log_level", type=str, default=DEFAULT_LOG_LEVEL, choices=LOG_LEVELS,
                         help="Level of message logging, defaults to 'INFO'.")

    return parser
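
To see what the parser produces, the sketch below rebuilds an equivalent parser and feeds it an explicit argument list instead of reading sys.argv. DEFAULT_LOG_LEVEL and LOG_LEVELS are defined elsewhere in the real package; the values given here are assumptions, as is the function name build_parser.

```python
import argparse

# Assumed values: the real constants are defined elsewhere in plantdb
DEFAULT_LOG_LEVEL = "INFO"
LOG_LEVELS = ["DEBUG", "INFO", "WARNING", "ERROR", "CRITICAL"]


def build_parser():
    """Rebuild a parser with the same arguments as `parsing()` (illustrative copy)."""
    parser = argparse.ArgumentParser(description="Transfer scans between two FSDB instances.")
    parser.add_argument("origin", type=str, help='source database URL, as "host:port".')
    parser.add_argument("target", type=str, help='target database URL, as "host:port".')
    parser.add_argument("--filter", type=str, default=None,
                        help="optional regular expression to filter scan names.")
    parser.add_argument("--log-level", dest="log_level", type=str,
                        default=DEFAULT_LOG_LEVEL, choices=LOG_LEVELS)
    return parser


# Passing a list to parse_args() avoids depending on the real command line
args = build_parser().parse_args(["127.0.0.1:5001", "127.0.0.1:5002", "--filter", "^virtual_"])
print(args.origin, args.target, args.filter, args.log_level)
```

The two positional arguments are mandatory, while --filter and --log-level fall back to None and the default log level when omitted.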

sync_scan_archives

sync_scan_archives(origin_url, target_url, filter_pattern=None, log_level=DEFAULT_LOG_LEVEL)

Synchronizes scan archives between an origin and a target database.

This function retrieves a list of scan archives from an origin database and transfers each scan to a target database. It relies on REST API calls to download the scan archives from the origin and upload them to the target. Optionally filters the list of scans using a regular expression.

Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| origin_url | str | A correctly formatted URL pointing to the origin plantDB database with a REST API enabled. | required |
| target_url | str | A correctly formatted URL pointing to the target plantDB database with a REST API enabled. | required |
| filter_pattern | str | A regular expression to filter scan names. Defaults to None, which means no filtering is applied. | None |
| log_level | str | The log level for the logger. Defaults to DEFAULT_LOG_LEVEL, which is 'INFO'. | DEFAULT_LOG_LEVEL |

Raises:

| Type | Description |
|---|---|
| SystemExit | If there are errors parsing command-line arguments. |

Notes
  • The method assumes that both databases are accessible and have a REST API enabled.
  • The method refreshes the scan in the target database after each transfer.
  • Temporary files are stored in the /tmp directory during the transfer process and deleted afterward.
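
The download, upload, clean-up cycle described in these notes can be sketched without a running database. In the sketch below, fake_download and fake_upload are placeholders standing in for the REST calls, and tempfile.TemporaryDirectory replaces the hard-coded /tmp directory. Both are choices made for this illustration, not the module's behavior. A try/finally block ensures the archive is deleted even when the upload fails.

```python
import tempfile
from pathlib import Path


def fake_download(scan_id, out_dir):
    """Placeholder for the REST download: write a dummy archive and return its path."""
    path = Path(out_dir) / f"{scan_id}.zip"
    path.write_bytes(b"dummy archive")
    return path


def fake_upload(scan_id, path):
    """Placeholder for the REST upload: report how many bytes would be sent."""
    return f"Uploaded '{scan_id}' ({path.stat().st_size} bytes)"


def transfer(scan_ids, download=fake_download, upload=fake_upload):
    """Transfer each scan through a temporary directory, deleting each archive after use."""
    messages = []
    with tempfile.TemporaryDirectory() as tmp_dir:
        for scan_id in scan_ids:
            path = download(scan_id, tmp_dir)
            try:
                messages.append(upload(scan_id, path))
            finally:
                # Remove the archive whether or not the upload succeeded
                path.unlink(missing_ok=True)
    return messages


print(transfer(["scan_a", "scan_b"]))
```

Deleting each archive right after its upload keeps disk usage bounded by the largest single scan rather than by the whole database.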
Source code in plantdb/client/cli/fsdb_rest_api_sync.py
def sync_scan_archives(origin_url, target_url, filter_pattern=None, log_level=DEFAULT_LOG_LEVEL):
    """Synchronizes scan archives between an origin and a target database.

    This function retrieves a list of scan archives from an origin database and transfers each
    scan to a target database.
    It relies on REST API calls to download the scan archives from the origin and upload them to the target.
    Optionally filters the list of scans using a regular expression.

    Parameters
    ----------
    origin_url : str
        A correctly formatted URL pointing to the origin plantDB database with a REST API enabled.
    target_url : str
        A correctly formatted URL pointing to the target plantDB database with a REST API enabled.
    filter_pattern : str, optional
        A regular expression to filter scan names.
        Defaults to ``None``, which means no filtering is applied.
    log_level : str, optional
        The log level for the logger.
        Defaults to ``DEFAULT_LOG_LEVEL``, which is 'INFO'.

    Raises
    ------
    SystemExit
        If there are errors parsing command-line arguments.

    Notes
    -----
    - The method assumes that both databases are accessible and have a REST API enabled.
    - The method refreshes the scan in the target database after each transfer.
    - Temporary files are stored in the `/tmp` directory during the transfer process and deleted afterward.
    """
    logger = get_logger('fsdb_rest_api_sync', log_level=log_level)

    parsed_origin = urlparse(origin_url)
    origin_host, origin_port = parsed_origin.hostname, parsed_origin.port
    parsed_target = urlparse(target_url)
    target_host, target_port = parsed_target.hostname, parsed_target.port
    logger.info(f"Origin URL is '{origin_host}' on port '{origin_port}'.")
    logger.info(f"Target URL is '{target_host}' on port '{target_port}'.")

    scan_list = list_scan_names(host=origin_host, port=origin_port)
    if filter_pattern:
        scan_list = filter_scan(scan_list, filter_pattern, logger)
        if scan_list is None:
            # Invalid pattern: `filter_scan` has already logged the error, so stop here
            return
        logger.info(f"{len(scan_list)} scans match the filter pattern '{filter_pattern}'.")
    else:
        logger.info(f"Found {len(scan_list)} scans in origin database.")

    for scan_id in tqdm(scan_list, desc="Transfer:", unit="scan"):
        logger.debug(f"Transferring scan '{scan_id}'...")
        # Download from origin DB via REST API
        f_path, msg = download_scan_archive(scan_id, out_dir='/tmp', host=origin_host, port=origin_port)
        logger.debug(msg)
        # Upload to target DB via REST API
        msg = upload_scan_archive(scan_id, f_path, host=target_host, port=target_port)
        logger.debug(msg)
        # Delete the temporary file
        Path(f_path).unlink()
        # Refresh the scan in the target to load its infos:
        msg = refresh(scan_id, host=target_host, port=target_port)
        logger.debug(msg)

    return
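
One detail of the URL parsing above is worth checking before relying on it: urllib.parse.urlparse only fills hostname and port when the string contains a network location introduced by //. The sketch below illustrates that standard-library behavior on a bare host:port string. The helper name host_and_port is hypothetical, and whether the module normalizes the URL elsewhere is not shown on this page.

```python
from urllib.parse import urlparse


def host_and_port(address):
    """Parse 'host:port' (with or without a scheme) into (hostname, port)."""
    # Without '//' urlparse does not recognise a network location in the string
    if "//" not in address:
        address = "//" + address
    parsed = urlparse(address)
    return parsed.hostname, parsed.port


# A bare 'host:port' string yields no hostname and no port
bare = urlparse("127.0.0.1:5001")
print(bare.hostname, bare.port)

# Prefixing '//' (or giving a full URL) yields the expected pair
print(host_and_port("127.0.0.1:5001"))
print(host_and_port("http://localhost:5002"))
```

If the origin and target are given as bare host:port strings, a normalization step of this kind is one way to make sure the host and port reach the REST calls.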