Transferring Data To and From File Storage with Lustre
Many common use cases for File Storage with Lustre involve transferring a large amount of data. The best transfer method depends on the source, the destination, and the direction of the transfer.
The following table provides recommendations for common File Storage with Lustre data transfer scenarios.
For general information about private connections between OCI and on-premises data, see FastConnect and Site-to-Site VPN.
Transfer Data From... | To... | Recommended Method | Prerequisites and Considerations |
---|---|---|---|
OCI File Storage with Lustre | On-premises S3 Object Storage | Use | Instance should be able to connect to the Object Storage bucket. |
OCI File Storage with Lustre | On-premises file system (local disk, SAN or NAS) | Linux users can use instance-to-instance streaming and the fpsync tool to transfer data from OCI. For some examples, see Transferring On-Premises Data to File Storage with Lustre (Linux). The same technique can be used in reverse. | Ensure that network connectivity is established between source instance and destination. |
Transferring On-Premises Data to File Storage with Lustre (Linux)
The fpsync tool is a parallel wrapper of rsync. Linux users can install fpsync from a yum repository. The commands differ depending on the version of Linux.
- Enable the EPEL repository.
Linux 8 users can enable the repository using the following command:
sudo yum install -y https://dl.fedoraproject.org/pub/epel/epel-release-latest-8.noarch.rpm
Linux 9 users can enable the repository using the following command:
sudo yum install -y https://dl.fedoraproject.org/pub/epel/epel-release-latest-9.noarch.rpm
- Install the tool. The fpsync utility is provided by the fpart package:
sudo yum install fpart -y
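The version-dependent step above can be sketched as a small shell helper. This is a minimal illustration, not an official installer: the function names `epel_release_url` and `detect_major_version` are hypothetical, and it assumes that `/etc/os-release` defines `VERSION_ID` in a form such as `8.9`. The sketch only prints the commands so that they can be reviewed before being run.

```shell
# Return the EPEL release package URL for a major Linux version.
# Only versions 8 and 9 are covered, matching the commands in this section.
epel_release_url() {
  case "$1" in
    8|9) echo "https://dl.fedoraproject.org/pub/epel/epel-release-latest-${1}.noarch.rpm" ;;
    *)   echo "unsupported major version: $1" >&2; return 1 ;;
  esac
}

# Read the major version from an os-release file (assumption: VERSION_ID="8.9" style).
# The file is sourced in a subshell so its variables do not leak into the caller.
detect_major_version() {
  ( . "${1:-/etc/os-release}" && echo "${VERSION_ID%%.*}" )
}

# Dry run for an example major version: print the commands instead of running them.
major=9
echo "sudo yum install -y $(epel_release_url "$major")"
echo "sudo yum install fpart -y"
```

To use the detected version instead of the fixed example value, replace `major=9` with `major=$(detect_major_version)`.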
Before beginning the data transfer, complete the following prerequisites:
- Ensure that network connectivity is established between the on-premises data source and OCI. Use a FastConnect or Site-to-Site VPN connection to enable fast instance-to-instance streaming over SSH.
- Create an Oracle Linux instance in OCI.
- Attach or mount the on-premises storage share on a Linux server. A dedicated instance is recommended.
Use the fpsync tool to perform the initial copy of on-premises data to OCI File Storage with Lustre. Afterward, synchronize incremental changes with rsync, because fpsync doesn't have the --delete option and therefore can't delete files and folders in the destination that don't exist in the source.
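The two-phase approach described above might look like the following sketch. The source path, the destination host `lustre-client.example.com`, the `opc` user, and the job count are all hypothetical placeholders, and the function names are illustrative only. The fpsync form mirrors the streaming command used elsewhere in this document, with `-n` added to set the number of concurrent sync jobs. The functions print the commands rather than running them.

```shell
# Hypothetical locations; replace with the real source mount and OCI destination.
SRC_DIR="${SRC_DIR:-/mnt/onprem_share/data}"
DEST="${DEST:-opc@lustre-client.example.com:/mnt/lustre/data}"

# Phase 1: initial bulk copy with fpsync.
# $1 is the number of concurrent sync jobs (fpsync -n).
# -o passes the quoted options through to each rsync worker.
initial_copy_cmd() {
  printf 'fpsync -n %s -o "-e ssh --progress" %s/ %s/\n' "$1" "$SRC_DIR" "$DEST"
}

# Phase 2: incremental pass with plain rsync.
# --delete removes destination entries that no longer exist in the source,
# which is the step fpsync cannot perform.
incremental_sync_cmd() {
  printf 'rsync -a --delete -e ssh %s/ %s/\n' "$SRC_DIR" "$DEST"
}

# Print both commands for review.
initial_copy_cmd 8
incremental_sync_cmd
```

Because `--delete` removes data at the destination, it can be worth adding rsync's `-n` (dry run) option to the incremental command on the first pass to preview what would change.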
For more information about fpsync options, see the fpsync man page.
Using Instance-to-Instance Streaming to Transfer File Storage with Lustre Data
The fpsync tool is a parallel wrapper of rsync. You can use fpsync and instance-to-instance streaming to transfer data between mounted File Storage with Lustre file systems.
To install fpsync, enable the Oracle Linux developer EPEL repository on the OCI instance and install the fpart package, which includes the fpsync utility. The command differs based on the version of Oracle Linux in use.
Oracle Linux 7:
yum --enablerepo ol7_developer_EPEL install -y fpart
Oracle Linux 8:
yum --enablerepo ol8_developer_EPEL install -y fpart
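As a rough sketch, the repository name can be derived from the Oracle Linux major version. The helper names `developer_epel_repo` and `fpart_install_cmd` are hypothetical, and only the two repository names shown in this section are covered. Whether an equivalent repository exists for other releases is not stated here, so other versions are rejected.

```shell
# Map an Oracle Linux major version to the developer EPEL repository name.
# Only versions 7 and 8 are handled, matching the commands in this section.
developer_epel_repo() {
  case "$1" in
    7|8) echo "ol${1}_developer_EPEL" ;;
    *)   echo "no repository known for Oracle Linux $1" >&2; return 1 ;;
  esac
}

# Print the install command for a given major version without running it.
fpart_install_cmd() {
  repo=$(developer_epel_repo "$1") || return 1
  echo "yum --enablerepo ${repo} install -y fpart"
}

# Example: show the command for Oracle Linux 8.
fpart_install_cmd 8
```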
After installing the tool, use an instance-to-instance streaming command such as this to stream data:
fpsync -o "-e ssh --progress" /<src_path>/test <ssh_user>@<remote_ip>:/<dest_path>/
For more information and options, see the fpsync man page.
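The following example shows the performance difference between the two approaches. The first fpsync command streams over SSH to a remote instance, and the second copies between local paths: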
# date; time fpsync -o "-e ssh --progress --log-file ~/speedtest.log" /src_path/test/ root@OCI_lfsclient:/lfs_dest_path/ ; date
Sun Mar 13 15:22:58 GMT 2022
real 0m1.467s
user 0m0.111s
sys 0m0.075s
Sun Mar 13 15:23:00 GMT 2022
# ls -ltrd test
drwxr-xr-x. 2 root root 1 Mar 13 15:22 test
# du -sh test
1001M test
# cp -r test test1
# date; time fpsync -o "--progress --log-file ~/speedtest1.log" /src_path/test/ /lfs_dest_path/ ; date
Sun Mar 13 15:25:16 GMT 2022
real 1m28.847s
user 0m3.688s
sys 0m1.439s
Sun Mar 13 15:26:44 GMT 202
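When comparing runs like these, it can help to convert the `real` values reported by `time` into seconds. The following is a small sketch for that arithmetic only. The helper names `to_seconds` and `time_ratio` are hypothetical, and the sketch assumes durations in the `<minutes>m<seconds>s` form shown in the output above.

```shell
# Convert a time-style duration such as "1m28.847s" to seconds.
# awk splits the string on "m" and "s": field 1 is minutes, field 2 is seconds.
to_seconds() {
  echo "$1" | awk -F'[ms]' '{ printf "%.3f\n", $1 * 60 + $2 }'
}

# Print how many times longer the first duration is than the second.
time_ratio() {
  awk -v a="$(to_seconds "$1")" -v b="$(to_seconds "$2")" \
    'BEGIN { printf "%.1f\n", a / b }'
}

# Durations taken from the two runs shown above.
to_seconds "0m1.467s"
to_seconds "1m28.847s"
time_ratio "1m28.847s" "0m1.467s"
```

A ratio like this describes only elapsed time. It says nothing about how much data each run actually moved, so check the log files written by each command before drawing conclusions.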