Using File Storage Parallel Tools
The Parallel File Tools suite provides parallel versions of tar, rm, and cp. These tools can run requests on large file systems in parallel, maximizing performance for data protection operations.
The toolkit includes:
- partar: Use this command to create and extract tarballs in parallel.
  Note: The partar tool supports the extraction of tar files created in the GNU basic tar POSIX 1003.1-1990 format. Files created in other archive formats, such as PAX, are not supported.
- parrm: Use this command to recursively remove a directory in parallel.
- parcp: Use this command to recursively copy a directory in parallel.
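One rough way to spot a PAX archive before handing it to partar is to look at the type flag of its first header. In the tar header layout, the type flag is the single byte at offset 156, and PAX extended headers use the values x and g. The sketch below assumes an uncompressed archive and inspects only the first header, so it is a heuristic and not a full validation. The function names are hypothetical and are not part of the tool suite.

```shell
# Print the type flag byte (offset 156) of the first 512-byte tar header.
first_typeflag() {
    dd if="$1" bs=1 skip=156 count=1 2>/dev/null
}

# Succeed if the archive begins with a PAX extended header ('x' or 'g').
# Heuristic only: later headers are not inspected.
starts_with_pax_header() {
    case "$(first_typeflag "$1")" in
        x|g) return 0 ;;
        *)   return 1 ;;
    esac
}

# Example use (archive name is illustrative):
#   if starts_with_pax_header backup.tar; then
#       echo "backup.tar looks like a PAX archive" >&2
#   fi
```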
Installing the Parallel File Tools
The tool suite is distributed as an RPM for Oracle Linux, Red Hat Enterprise Linux, and CentOS.
To install Parallel File Tools on an Oracle Linux instance:
- Open a terminal window on the destination instance.
- Type the following command:
sudo yum install -y fss-parallel-tools
To install Parallel File Tools on an Oracle Linux 8 instance:
- Open a terminal window on the destination instance.
- Install the Oracle Linux developer repository, if needed, by using the following command:
dnf install oraclelinux-developer-release-el8
- Install the Parallel File Tools from the developer repository using the following command:
dnf --enablerepo=ol8_developer install fss-parallel-tools
To install Parallel File Tools on CentOS and Red Hat 6.x:
- Open a terminal window on the destination instance.
- Type the following commands:
sudo wget http://yum.oracle.com/public-yum-ol6.repo -O /etc/yum.repos.d/public-yum-ol6.repo
sudo wget http://yum.oracle.com/RPM-GPG-KEY-oracle-ol6 -O /etc/pki/rpm-gpg/RPM-GPG-KEY-oracle
sudo yum --enablerepo=ol6_developer install fss-parallel-tools
To install Parallel File Tools on CentOS and Red Hat 7.x:
- Open a terminal window on the destination instance.
- Type the following commands:
sudo wget http://yum.oracle.com/public-yum-ol7.repo -O /etc/yum.repos.d/public-yum-ol7.repo
sudo wget http://yum.oracle.com/RPM-GPG-KEY-oracle-ol7 -O /etc/pki/rpm-gpg/RPM-GPG-KEY-oracle
sudo yum --enablerepo=ol7_developer install fss-parallel-tools
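After installation, a quick check that the three commands are reachable can save a confusing failure later. The following is a minimal sketch, assuming the package places partar, parrm, and parcp on the PATH. The helper name missing_tools is hypothetical.

```shell
# Print the name of every listed command that cannot be found on PATH.
# Prints nothing when all of them are available.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
    done
}

# Example use:
#   missing_tools partar parrm parcp
```

Empty output suggests that all three tools are installed for the current user.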
Using the Tools - Basic Examples
Here are some simple examples of how the different tools are commonly used in Oracle Cloud Infrastructure File Storage.
In this example, parcp is used to copy the directory "folder" in /source to /destination. The -P option is used to set the number of parallel threads you want to use.
$ parcp -P 16 /source/folder /destination
In the following example, parcp is used to copy the contents of the directory "folder" in /source to /destination. The "folder" directory itself is not copied.
$ parcp -P 16 /source/folder/. /destination
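The difference between the two source forms is easiest to see in the resulting layout. The sketch below reproduces both forms with the standard cp -R command in a scratch directory, on the assumption that parcp follows the same convention as described above. It does not call parcp, and the function name and paths are illustrative.

```shell
# Copy the same source directory using both forms and list the results.
demo_copy_forms() {
    work=$(mktemp -d)
    mkdir -p "$work/source/folder" "$work/dest_a" "$work/dest_b"
    touch "$work/source/folder/file.txt"

    # Form 1: the directory itself is copied, giving dest_a/folder/file.txt
    cp -R "$work/source/folder" "$work/dest_a"

    # Form 2: only the contents are copied, giving dest_b/file.txt
    cp -R "$work/source/folder/." "$work/dest_b"

    (cd "$work" && find dest_a dest_b -type f | sort)
    rm -rf "$work"
}
```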
In the following example, partar creates a .tar archive of the contents of the specified directory and stores it as a tarball in the current directory. The name of the directory that is being used to create the tarball is example.
$ partar pcf example.tar example -P 16
In the following example, the directory that is being used to create the tarball is again example. The tarball is being created in the /test directory.
$ partar pcf example.tar example -P 16 -C /test
Using the Tools - Advanced Examples
Here are some examples of how the different tools are used in more advanced scenarios.
You can specify which files and folders are included when you create a .tar archive using partar. Let's say you have a directory that looks like this:
[opc@example sourcedir]$ ls -l
total 180
-rw-r-----. 1 opc opc 0 Apr 15 02:55 example2020-04-15_02-55-33_217107549.error
-rw-r-----. 1 opc opc 10 Apr 15 03:18 example2020-04-15_02-55-33_217107549.log
-rw-rw-r--. 1 opc opc 12 Apr 15 03:18 example2020-04-15_03-18-13_267771997.error
-rw-rw-r--. 1 opc opc 10 Apr 15 03:18 example2020-04-15_03-18-13_267771997.log
-rwxr-xr-x. 1 opc opc 37 Nov 30 2017 File1.txt
-rwxr-xr-x. 1 opc opc 15 Dec 1 2017 File2.txt
-rwxr-xr-x. 1 opc opc 39 Nov 30 2017 File3.txt
-rwxr-xr-x. 1 opc opc 57 Dec 1 2017 File4.txt
The following command:
- Creates a .tar archive that contains a mydir directory named as specified.
- Includes File1.txt, File2.txt, File3.txt, and File4.txt.
- Excludes all .log and .error files.
- Sends the tarball from /sourcedir to /mnt/destinationdir.
- Extracts the .tar archive.
[opc@example sourcedir]$ sudo partar cf - mydir --exclude '*.log*' --exclude '*.err*' | sudo partar xf - -C /mnt/destinationdir
Performing ls -l on /mnt/destinationdir/mytar shows that only the desired files have been copied.
[opc@example mytar]$ ls -l
total 148
-rwxr-xr-x. 1 opc opc 37 Nov 30 2017 File1.txt
-rwxr-xr-x. 1 opc opc 15 Dec 1 2017 File2.txt
-rwxr-xr-x. 1 opc opc 39 Nov 30 2017 File3.txt
-rwxr-xr-x. 1 opc opc 57 Dec 1 2017 File4.txt
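The create-and-extract pipeline can be tried on a small scale with the standard tar command before running it against a large file system. This sketch is an analogue only: it uses tar in place of partar, places the --exclude options before the directory name, and works in a scratch directory with made-up file names.

```shell
# Stream an archive of mydir through a pipe, skipping log and error files,
# and extract it in a separate destination directory.
demo_tar_pipe() {
    work=$(mktemp -d)
    mkdir -p "$work/src/mydir" "$work/dest"
    touch "$work/src/mydir/File1.txt" \
          "$work/src/mydir/run.log" \
          "$work/src/mydir/run.error"

    (cd "$work/src" && tar -cf - --exclude '*.log*' --exclude '*.err*' mydir) |
        tar -xf - -C "$work/dest"

    # List what arrived at the destination.
    (cd "$work/dest" && find . -type f | sort)
    rm -rf "$work"
}
```

Because the archive is written to standard output and read from standard input, no intermediate tarball is stored on disk.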
When excluding a directory or file from the archive, provide only the name of the directory or file. The --exclude option does not support use of an absolute path. Using an absolute path in the --exclude option will not exclude the specified directory or files from the .tar archive. For example, if you need to exclude a directory called testing from the path of the source directory, you would specify that in a command like the following:
sudo partar pczf name_of_tar_file.tar.gz /<path_source_directory> --exclude=testing
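In scripts, the path to exclude is often held as an absolute path. A small helper can reduce it to the bare name before it is passed to --exclude. This is a sketch only; the helper name exclude_arg is hypothetical and relies on the standard basename utility.

```shell
# Turn a path into an --exclude argument that carries only the final name.
exclude_arg() {
    printf '%s\n' "--exclude=$(basename "$1")"
}

# Example use (paths are placeholders):
#   sudo partar pczf name_of_tar_file.tar.gz /<path_source_directory> \
#       "$(exclude_arg /<path_source_directory>/testing)"
```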
All files or directories that match the --exclude pattern under the path of the source directory will be excluded from the partar archive.
You can specify which files and folders are included when you use parcp to copy from one directory to another. Let's say you have a directory that looks like this:
[opc@example sourcedir]$ ls -l
total 180
-rw-r-----. 1 opc opc 0 Apr 15 02:55 example2020-04-15_02-55-33_217107549.error
-rw-r-----. 1 opc opc 10 Apr 15 03:18 example2020-04-15_02-55-33_217107549.log
-rw-rw-r--. 1 opc opc 12 Apr 15 03:18 example2020-04-15_03-18-13_267771997.error
-rw-rw-r--. 1 opc opc 10 Apr 15 03:18 example2020-04-15_03-18-13_267771997.log
-rwxr-xr-x. 1 opc opc 37 Nov 30 2017 File1.txt
-rwxr-xr-x. 1 opc opc 15 Dec 1 2017 File2.txt
-rwxr-xr-x. 1 opc opc 39 Nov 30 2017 File3.txt
-rwxr-xr-x. 1 opc opc 57 Dec 1 2017 File4.txt
First, create a .txt file containing a list of files you want to exclude. In this example, it's /home/opc/list.txt.
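The list file holds one name or pattern per line. One way to create it from the command line is sketched below. The helper name write_exclude_list is hypothetical, and the patterns are quoted so that the shell does not expand them before they are written.

```shell
# Write each remaining argument as one line of the given list file,
# replacing any previous contents.
write_exclude_list() {
    list_file=$1
    shift
    printf '%s\n' "$@" > "$list_file"
}

# Example use, matching the list in this example:
#   write_exclude_list /home/opc/list.txt 'File4.txt' '*.log*' '*.err*'
```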
The following command copies the contents from sourcedir to /mnt/destinationdir and:
- Copies File1.txt, File2.txt, and File3.txt.
- Excludes File4.txt and the .log and .error files, as listed in /home/opc/list.txt.
[opc@example ~]$ cat /home/opc/list.txt
File4.txt
*.log*
*.err*
[opc@example ~]$ date; time sudo parcp --exclude-from=/home/opc/list.txt -P 16 --restore /sourcedir /mnt/destinationdir; date
Mon Jun 1 15:58:30 GMT 2020
real 9m55.820s
user 0m3.602s
sys 1m5.441s
Mon Jun 1 16:08:25 GMT 2020
Performing ls -l on /mnt/destinationdir shows that only the desired files have been copied.
[opc@example destinationdir]$ ls -l
total 91
-rwxr-xr-x. 1 opc opc 37 Nov 30 2017 File1.txt
-rwxr-xr-x. 1 opc opc 15 Dec 1 2017 File2.txt
-rwxr-xr-x. 1 opc opc 39 Nov 30 2017 File3.txt
The --restore option in parcp is similar to using the -a, -r, -x, and -H options in rsync. (See the rsync(1) Linux man page.) The -P option is used to set the number of parallel threads you want to use.
The --restore option includes the following behavior:
- Recurse into directories
- Stop at file system boundaries
- Preserve hard links, symlinks, permissions, modification times, group, owners, and special files such as named sockets and FIFO files
$ parcp -P 16 --restore /source/folder/ /destination
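For readers who know rsync better, the comparable single-threaded invocation is sketched below. It passes the four rsync options named above: -a (archive mode), -r (recurse), -x (stay on one file system), and -H (preserve hard links). The wrapper name is hypothetical, and the sketch makes no claim that the two tools behave identically in every case.

```shell
# rsync invocation comparable to "parcp --restore SOURCE DESTINATION".
rsync_like_restore() {
    rsync -a -r -x -H "$1" "$2"
}

# Example use:
#   rsync_like_restore /source/folder/ /destination
```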
You can use parcp with the --restore and --delete options to sync files between a source and target folder. This is a good substitute for using rsync in parallel. As files are added or removed from the source directory, you can run this command at regular intervals to add or remove the same files from the destination directory. You can automate syncing by using this command option in a cron job.
sudo parcp -P 32 --restore --delete /source/folder/ /destination
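When the sync runs from cron, a long copy can still be in progress when the next run starts (the copy in the earlier example took about ten minutes). The wrapper below is a minimal sketch of one way to skip a run while the previous one is active, using a lock directory. The script path, lock location, and schedule are all assumptions chosen for illustration.

```shell
#!/bin/sh
# Hypothetical wrapper, saved for example as /usr/local/bin/fss-sync.sh.
# An illustrative crontab entry that runs it every 30 minutes:
#   */30 * * * * /usr/local/bin/fss-sync.sh

# Lock location is an assumption; override it by setting LOCK_DIR.
LOCK_DIR="${LOCK_DIR:-/tmp/fss-sync.lock}"

# Run the given command unless another run holds the lock.
# mkdir either creates the directory or fails, so it works as a simple lock.
run_exclusive() {
    if ! mkdir "$LOCK_DIR" 2>/dev/null; then
        echo "previous sync still running; skipping this run" >&2
        return 0
    fi
    "$@"
    status=$?
    rmdir "$LOCK_DIR"
    return "$status"
}

# In the real script, the last line would call the sync command, for example:
#   run_exclusive sudo parcp -P 32 --restore --delete /source/folder/ /destination
```

If the script is killed partway through, the lock directory is left behind and must be removed by hand before the next run can proceed.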