sync
plantdb.commons.sync
Synchronization Utility for PlantDB Databases
This module provides a synchronization mechanism for File System Databases (FSDB) in the PlantDB project. It transfers data between local and remote database instances.
Key Features
- Support for local and remote database synchronization
- Intelligent file transfer with modification time and size checking
- Automatic locking and unlocking of source and target databases
- Recursive directory synchronization
- SSH-based remote file transfer using SFTP
- Error handling and validation of database paths
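The modification time and size checks combined with recursive traversal can be illustrated for the local case as follows. This is a minimal sketch, not the module's implementation: the names `needs_copy` and `sync_tree` are hypothetical, and the rule "copy when the size differs or the source is newer" is an assumption about what such a check typically looks like.

```python
import shutil
from pathlib import Path


def needs_copy(src: Path, dst: Path) -> bool:
    """Return True if dst is missing, differs in size, or is older than src."""
    if not dst.exists():
        return True
    s, d = src.stat(), dst.stat()
    # Compare whole seconds to tolerate file systems with coarse timestamps.
    return s.st_size != d.st_size or int(s.st_mtime) > int(d.st_mtime)


def sync_tree(src_root: Path, dst_root: Path) -> list:
    """Recursively copy new or changed files from src_root to dst_root.

    Returns the relative paths of the files that were copied.
    """
    copied = []
    for src in sorted(src_root.rglob("*")):
        rel = src.relative_to(src_root)
        dst = dst_root / rel
        if src.is_dir():
            dst.mkdir(parents=True, exist_ok=True)
        elif needs_copy(src, dst):
            dst.parent.mkdir(parents=True, exist_ok=True)
            shutil.copy2(src, dst)  # copy2 also copies the modification time
            copied.append(rel)
    return copied
```

Because `shutil.copy2` preserves the modification time, a second call on an unchanged tree copies nothing.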
Usage Examples
Create two test databases, a source with a dataset and a target without one, then sync them:
>>> from plantdb.commons.sync import FSDBSync
>>> from plantdb.commons.test_database import test_database
>>> # Create a test source database
>>> db_source = test_database()
>>> db_source.connect()
>>> print(db_source.list_scans()) # list scans in the source database
['real_plant_analyzed']
>>> db_source._unlock_db() # Unlock the database (remove lock file)
>>> # Create a test target database
>>> db_target = test_database(dataset=None)
>>> db_target.connect()
>>> print(db_target.list_scans()) # verify that target database is empty
[]
>>> db_target._unlock_db() # Unlock the database (remove lock file)
>>> # Sync target database with source
>>> db_sync = FSDBSync(db_source.path(), db_target.path())
>>> db_sync.sync()
>>> # List directories in the target database
>>> print([i for i in db_target.path().iterdir()])
>>> # Reload target database to ensure that the new scans are available
>>> db_target.reload()
>>> print(db_target.list_scans()) # verify that target database contains 1 new scan
['real_plant_analyzed']
>>> db_source.disconnect() # Remove the test database
>>> db_target.disconnect() # Remove the test database
FSDBSync
FSDBSync(source, target)
Class for synchronizing two FSDB databases.
It validates both source and target by checking that:
- there is a marker file in the DB path root
- the DB is not busy, i.e. there is no lock file in the DB path root.
It locks both databases for the duration of the sync. The sync is done using rsync as a subprocess.
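The two validity checks can be sketched for a local path as shown here. The function name `check_db_path` and the file names `marker` and `lock` are placeholders chosen for illustration; the real marker and lock file names are defined by PlantDB and are not stated on this page.

```python
from pathlib import Path

# Hypothetical file names, used for illustration only.
MARKER_FILE = "marker"
LOCK_FILE = "lock"


def check_db_path(db_root: Path, marker: str = MARKER_FILE, lock: str = LOCK_FILE) -> None:
    """Raise OSError if db_root is not a database root or is currently locked."""
    if not db_root.is_dir():
        raise NotADirectoryError(f"{db_root} is not a directory")
    if not (db_root / marker).is_file():
        raise OSError(f"{db_root} has no marker file: not a database root")
    if (db_root / lock).exists():
        raise OSError(f"{db_root} has a lock file: database is busy")
```

Raising on the first failed check lets the caller abort before any lock is taken or any file is transferred.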
Attributes:
Name | Type | Description |
---|---|---|
source_str | str | Source path |
target_str | str | Target path |
source | dict | Source path description |
target | dict | Target path description |
ssh_clients | dict | Dictionary of SSH clients, keyed by host name. |
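To give an idea of what a path description dictionary could hold, the sketch below splits a path string into its parts. Everything here is an assumption made for illustration: the function name `describe_path`, the `user@host:/path` form for remote paths, and the dictionary keys are not taken from the PlantDB source.

```python
from pathlib import Path


def describe_path(path) -> dict:
    """Split a path into a description dict (hypothetical layout).

    A string of the form "user@host:/abs/path" is treated as remote;
    anything else is treated as a local path.
    """
    text = str(path)
    head, sep, tail = text.partition(":")
    if sep and "@" in head and tail.startswith("/"):
        user, _, host = head.partition("@")
        return {"type": "remote", "user": user, "host": host, "path": tail}
    return {"type": "local", "user": None, "host": None, "path": str(Path(text))}
```

With such a description, the host name could serve as the key into a dictionary of SSH clients, so that one connection is reused per host.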
Class constructor.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
source | str or Path | Source database path (remote or local) | required |
target | str or Path | Target database path (remote or local) | required |
ssh_clients | dict | Dictionary of SSH clients, keyed by host name. | required |
Source code in plantdb/commons/sync.py
__del__
__del__()
Ensure unlocking on object destruction.
Source code in plantdb/commons/sync.py
lock
lock()
Lock both source and target databases prior to sync.
Source code in plantdb/commons/sync.py
sync
sync()
Sync the two databases, keeping both locked for the duration of the transfer.
Source code in plantdb/commons/sync.py
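The lock, transfer, unlock sequence can be sketched with a context manager that guarantees the locks are released even when the transfer fails. This is an illustration under stated assumptions, not the code of `FSDBSync.sync()`: the names `locked` and `sync_with_locks`, the `lock` file name, and the `transfer` callable are all hypothetical, and only local paths are handled.

```python
from contextlib import contextmanager
from pathlib import Path

LOCK_FILE = "lock"  # hypothetical lock file name


@contextmanager
def locked(*db_roots: Path):
    """Create a lock file in each database root and remove them on exit."""
    acquired = []
    try:
        for root in db_roots:
            lock = root / LOCK_FILE
            # Mode "x" raises FileExistsError if the lock file already exists.
            lock.open("x").close()
            acquired.append(lock)
        yield
    finally:
        # Remove only the locks created by this call, even if the body raised.
        for lock in acquired:
            lock.unlink()


def sync_with_locks(source: Path, target: Path, transfer) -> None:
    """Run transfer(source, target) while both databases are locked."""
    with locked(source, target):
        transfer(source, target)
```

Tracking the acquired locks separately means that a database locked by another process keeps its lock file when this call aborts.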
unlock
unlock()
Unlock both source and target databases after sync.
Source code in plantdb/commons/sync.py