sync

plantdb.commons.sync

Synchronization Utility for PlantDB Databases

This module provides a synchronization mechanism for File System Databases (FSDB) in the PlantDB project. It transfers data between local and remote database instances.

Key Features

  • Support for local and remote database synchronization
  • File transfer with modification time and size checking
  • Automatic locking and unlocking of source and target databases
  • Recursive directory synchronization
  • SSH-based remote file transfer using SFTP
  • Error handling and validation of database paths
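The modification-time and size check combined with recursive traversal can be illustrated for the purely local case. This is a minimal sketch, not the module's implementation: `needs_copy` and `sync_tree` are hypothetical names, and the rule used here (copy when the target is missing, the sizes differ, or the source is newer) is an assumption about what such a check typically looks like.

```python
import shutil
from pathlib import Path


def needs_copy(src: Path, dst: Path) -> bool:
    """Return True if `dst` is missing or looks out of date compared to `src`."""
    if not dst.exists():
        return True
    src_stat, dst_stat = src.stat(), dst.stat()
    # Assumed rule: copy when sizes differ or the source was modified more recently.
    return src_stat.st_size != dst_stat.st_size or src_stat.st_mtime > dst_stat.st_mtime


def sync_tree(src_root: Path, dst_root: Path) -> list:
    """Recursively copy new or changed files; return the relative paths copied."""
    copied = []
    for src in sorted(src_root.rglob("*")):
        rel = src.relative_to(src_root)
        dst = dst_root / rel
        if src.is_dir():
            dst.mkdir(parents=True, exist_ok=True)
        elif needs_copy(src, dst):
            dst.parent.mkdir(parents=True, exist_ok=True)
            shutil.copy2(src, dst)  # copy2 also copies the modification time
            copied.append(rel)
    return copied
```

Because `shutil.copy2` carries the modification time over to the copy, a second call on an unchanged tree finds nothing to transfer.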

Usage Examples

Create two test databases, a source with a dataset and a target without a dataset, then sync them.

>>> from plantdb.commons.sync import FSDBSync
>>> from plantdb.commons.test_database import test_database
>>> # Create a test source database
>>> db_source = test_database()
>>> db_source.connect()
>>> print(db_source.list_scans())  # list scans in the source database
['real_plant_analyzed']
>>> db_source._unlock_db()  # Unlock the database (remove lock file)
>>> # Create a test target database
>>> db_target = test_database(dataset=None)
>>> db_target.connect()
>>> print(db_target.list_scans())  # verify that target database is empty
[]
>>> db_target._unlock_db()  # Unlock the database (remove lock file)
>>> # Sync target database with source
>>> db_sync = FSDBSync(db_source.path(), db_target.path())
>>> db_sync.sync()
>>> # List directories in the target database
>>> print([i for i in db_target.path().iterdir()])
>>> # Reload target database to ensure that the new scans are available
>>> db_target.reload()
>>> print(db_target.list_scans())  # verify that target database contains 1 new scan
['real_plant_analyzed']
>>> db_source.disconnect()  # Remove the test database
>>> db_target.disconnect()  # Remove the test database

FSDBSync

FSDBSync(source, target)

Class for synchronizing two FSDB databases.

It validates both source and target by checking that:

  • there is a marker file in the DB path root
  • the DB is not busy, i.e. there is no lock file in the DB path root.

It locks the two databases during the sync. The sync is done using rsync as a subprocess.
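The two validity checks can be sketched for a local database directory as follows. `check_local_db` is a hypothetical helper, and the file names `romidb` and `lock` are assumptions used for illustration; the actual marker and lock file names are defined by the PlantDB code base, not by this page.

```python
from pathlib import Path

MARKER_FILE = "romidb"  # assumed marker file name, for illustration only
LOCK_FILE = "lock"      # assumed lock file name, for illustration only


def check_local_db(root, marker=MARKER_FILE, lock=LOCK_FILE):
    """Raise OSError unless `root` looks like an idle database directory."""
    root = Path(root)
    # Check 1: the marker file identifies the directory as a database.
    if not (root / marker).is_file():
        raise OSError(f"Not a database: no '{marker}' marker file in {root}")
    # Check 2: a lock file means another process is using the database.
    if (root / lock).exists():
        raise OSError(f"Database is busy: found '{lock}' lock file in {root}")
    return root
```

Running the marker check first keeps a lock file in an unrelated directory from being reported as a busy database.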

Attributes:

Name         Type  Description
source_str   str   Source path
target_str   str   Target path
source       dict  Source path description
target       dict  Target path description
ssh_clients  dict  Dictionary of SSH clients, keyed by host name.

Examples:

>>> from plantdb.commons.sync import FSDBSync
>>> from plantdb.commons.test_database import test_database
>>> # Example: Create two test databases, a source with a dataset and a target without a dataset, then sync them.
>>> # Create a test source database
>>> db_source = test_database()
>>> db_source.connect()
>>> print(db_source.list_scans())  # list scans in the source database
['real_plant_analyzed']
>>> db_source._unlock_db()  # Unlock the database (remove lock file)
>>> # Create a test target database
>>> db_target = test_database(dataset=None)
>>> db_target.connect()
>>> print(db_target.list_scans())  # verify that target database is empty
[]
>>> db_target._unlock_db()  # Unlock the database (remove lock file)
>>> # Sync target database with source
>>> db_sync = FSDBSync(db_source.path(), db_target.path())
>>> db_sync.sync()
>>> # List directories in the target database
>>> print([i for i in db_target.path().iterdir()])
>>> # Reload target database to ensure that the new scans are available
>>> db_target.reload()
>>> print(db_target.list_scans())  # verify that target database contains 1 new scan
['real_plant_analyzed']
>>> db_source.disconnect()  # Remove the test database
>>> db_target.disconnect()  # Remove the test database

Class constructor.

Parameters:

Name    Type         Description                             Default
source  str or Path  Source database path (remote or local)  required
target  str or Path  Target database path (remote or local)  required
Source code in plantdb/commons/sync.py
def __init__(self, source, target):
    """Class constructor.

    Parameters
    ----------
    source : str or pathlib.Path
        Source database path (remote or local)
    target : str or pathlib.Path
        Target database path (remote or local)
    """
    self.source_str = source
    self.target_str = target
    self.source = _fmt_path(source)
    self.target = _fmt_path(target)
    self.ssh_clients = {}  # Store SSH connections
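The listing above only shows that `_fmt_path` turns a path into a description dict, and the other methods only read its "type" key (with "local" selecting the local code path). The following is a rough sketch of what such a helper might look like, not the actual `_fmt_path`: the name `describe_path`, the "host" and "path" keys, and the `host:/path` convention for remote locations are all assumptions, and the heuristic targets POSIX-style paths.

```python
from pathlib import Path


def describe_path(path):
    """Hypothetical stand-in for a path-description helper (POSIX-style paths)."""
    text = str(path)
    host, sep, remote_path = text.partition(":")
    # Assumed convention: "host:/absolute/path" or "user@host:/absolute/path".
    if sep and host and "/" not in host and remote_path.startswith("/"):
        return {"type": "remote", "host": host, "path": remote_path}
    return {"type": "local", "path": str(Path(text))}
```

Code that branches on `db["type"] == "local"` can then treat every other value as remote.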

__del__

__del__()

Ensure unlocking on object destruction.

Source code in plantdb/commons/sync.py
def __del__(self):
    """Ensure unlocking on object destruction."""
    try:
        self.unlock()
    except Exception:
        # Never raise from a destructor; unlocking here is best-effort.
        pass

lock

lock()

Lock both source and target databases prior to sync.

Source code in plantdb/commons/sync.py
def lock(self):
    """Lock both source and target databases prior to sync."""
    for db in [self.source, self.target]:
        if db["type"] == "local":
            self._lock_local(db)
        else:
            self._lock_remote(db)

sync

sync()

Sync the two DBs. Both databases are locked first and unlocked afterwards, even if the transfer fails; any open SSH connections are then closed.

Source code in plantdb/commons/sync.py
def sync(self):
    """Sync the two DBs using modern Python approaches with proper locking."""
    self.lock()
    try:
        if self.source["type"] == "local" and self.target["type"] == "local":
            self._sync_local()
        else:
            self._sync_remote()
    finally:
        self.unlock()
        self._close_ssh_connections()
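The lock / try / finally pattern used by `sync()` can also be written as a context manager. The sketch below is an illustration for local directories only, assuming a lock is a plain file named `lock` in the database root; `locked` is a hypothetical name and not part of `plantdb.commons.sync`.

```python
from contextlib import contextmanager
from pathlib import Path


@contextmanager
def locked(*db_roots, lock_name="lock"):
    """Hold a lock file in each directory for the duration of the `with` block."""
    acquired = []
    try:
        for root in db_roots:
            lock_file = Path(root) / lock_name
            # Mode "x" raises FileExistsError if the lock file already exists.
            lock_file.open("x").close()
            acquired.append(lock_file)
        yield
    finally:
        # Release only the locks this call created, even if an error occurred.
        for lock_file in acquired:
            lock_file.unlink(missing_ok=True)
```

A transfer would then run inside `with locked(source_dir, target_dir): ...`. One difference from the listing above: `sync()` calls `lock()` before entering its `try` block, whereas this sketch acquires the locks inside it, so a failure on the second lock still releases the first.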

unlock

unlock()

Unlock both source and target databases after sync.

Source code in plantdb/commons/sync.py
def unlock(self):
    """Unlock both source and target databases after sync."""
    for db in [self.source, self.target]:
        if db["type"] == "local":
            self._unlock_local(db)
        else:
            self._unlock_remote(db)