sFTP is still used on a daily basis. Often the simpler, well-proven technologies just do the job, much like a hammer. Today’s post will show how to use Python to connect to an sFTP site and securely upload or download your files.
On Stack Overflow you often get incomplete or partial code samples that don’t explain step by step what you need to do, and with sFTP every step is critical. The code presented here is not production ready (for example, it has very little exception handling), but it wouldn’t take much effort to adapt it. Instead we will concentrate on the important parts and get you transferring files quickly. Let’s go!
Directory Structure
The directory structure will look like the following:
    main.py
    config/
        __init__.py
        known_hosts
        sftp_config.yaml
    src/
        file_upload.py
main.py
This is the file we will invoke to start the program. It is rather simple.
    # -- LIBRARY -----------------------------------------------------------------
    import sys

    # -- LOCAL FILES -------------------------------------------------------------
    from src.file_upload import FileUpload

    def main(argv) -> None:
        file_list = ['file_one.txt', 'file_two.txt']

        # upload files to sftp
        if len(file_list) > 0:
            fp = FileUpload()
            fp.send_files(file_list=file_list)

    # Start the program
    if __name__ == '__main__':
        main(sys.argv[1:])
    # Start the program
    if __name__ == '__main__':
        main(sys.argv[1:])
The file uses sys.argv[1:] so that you can pass in arguments when running the program. To keep the focus on sFTP, we won’t be using arguments and will just hard-code the file names in the main function.
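Since main.py already receives sys.argv[1:], the hard-coded list could later be swapped for command-line arguments. A minimal sketch, assuming the standard argparse module; parse_file_list and DEFAULT_FILES are hypothetical names and not part of the original code:

```python
import argparse

# Hypothetical defaults, mirroring the hard-coded names in main.py.
DEFAULT_FILES = ["file_one.txt", "file_two.txt"]


def parse_file_list(argv: list) -> list:
    """Return the file names given on the command line, or the defaults."""
    parser = argparse.ArgumentParser(description="Upload files over sFTP.")
    # Zero or more positional arguments, each one a local file to upload.
    parser.add_argument("files", nargs="*", help="local files to upload")
    args = parser.parse_args(argv)
    # Fall back to a copy of the defaults when no names were supplied.
    return args.files or list(DEFAULT_FILES)
```

Inside main, file_list = parse_file_list(argv) would then replace the hard-coded assignment, and the existing length check still guards against an empty list.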
Now let’s go line by line.
    from src.file_upload import FileUpload
This is the FileUpload class we will create later.
    def main(argv) -> None:
        file_list = ['file_one.txt', 'file_two.txt']

        # upload files to sftp
        if len(file_list) > 0:
            fp = FileUpload()
            fp.send_files(file_list=file_list)

    # Start the program
    if __name__ == '__main__':
        main(sys.argv[1:])
The main function, pretty simple isn’t it?
file_list is a list of files we will upload to the sFTP server. Alternatively, you could have a list of files you wish to download.
Next we check whether file_list has any values in it, in case the list is being passed in via sys.argv.
Next we create an instance of the FileUpload class and assign it to fp.
The last step is to send the files by calling the send_files method of the class.
Configuration File
In the configuration directory we have three files.
- __init__.py
- known_hosts
- sftp_config.yaml
We are going to look at the sftp_config.yaml configuration file and the __init__.py file first. We will come back to the known_hosts file later on.
sftp_config.yaml
The configuration file will store the details on how to connect to the sFTP server and what files we want to upload.
    # ----------------------------------------------------------------------------
    # SFTP CONFIGURATION OPTIONS
    # ----------------------------------------------------------------------------

    # sftp connection values
    ---
    SERVER:
      HOSTNAME: 'THESFTP-SERVER'
      USERNAME: 'myusername'
      KEY_FILENAME: 'C:\\Users\\my_name\\.ssh\\thekey.pem'
      KNOWN_HOSTS: '/config/known_hosts'

    FILE:
      LOCAL_DIRECTORY: 'main/assets/'
      REMOTE_DIRECTORY: '/home/username/directory/directory/main/assets/'
      FILE_NAMES:
        ONE: 'image_one.png'
        TWO: 'image_two.png'
        THREE: 'image_three.png'

    WAIT_PERIOD: &waitperiod "10"
The configuration file is plain text, and the values are pretty self-explanatory. We will talk about KEY_FILENAME and KNOWN_HOSTS later on.
Make sure to specify your details, including HOSTNAME, USERNAME, LOCAL_DIRECTORY, REMOTE_DIRECTORY, etc…
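A missing value only shows up once the connection is attempted, so it can help to check the configuration up front. The sketch below is one possible check and not part of the original code: missing_keys and REQUIRED_KEYS are hypothetical names, and the required settings are simply the ones the FileUpload class reads later in this post. It works on a plain dict; a Box is a dict subclass, so the loaded configuration should be accepted as well.

```python
# Settings read by the FileUpload class later in this post (an assumption
# about what counts as "required"; extend it to suit your own code).
REQUIRED_KEYS = {
    "SERVER": ["HOSTNAME", "USERNAME", "KEY_FILENAME", "KNOWN_HOSTS"],
    "FILE": ["REMOTE_DIRECTORY"],
}


def missing_keys(config: dict) -> list:
    """Return dotted names of required settings that are absent or empty."""
    missing = []
    for section, keys in REQUIRED_KEYS.items():
        # A missing section is treated the same as an empty one.
        values = config.get(section) or {}
        for key in keys:
            if not values.get(key):
                missing.append(f"{section}.{key}")
    return missing


example = {
    "SERVER": {"HOSTNAME": "THESFTP-SERVER", "USERNAME": "myusername"},
    "FILE": {"REMOTE_DIRECTORY": "/home/username/"},
}
print(missing_keys(example))  # ['SERVER.KEY_FILENAME', 'SERVER.KNOWN_HOSTS']
```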
config
The init file will be used to load the .yaml configuration file. It is a handy way of storing the sFTP configuration parameters without having to change the main parts of the code.
    import pathlib, sys
    from box import Box
We need three libraries: the standard pathlib and sys, plus python-box, which I use to make referencing values in the configuration easier. python-box needs a YAML parser to read .yaml files; I use ruamel.yaml, finding it easier and quicker.
If not already installed, from your command line you can use pip to install them.
    pip install python-box
    pip install ruamel.yaml
File: __init__.py
    # ----------------------------------------------------------------------------
    # load configuration
    # ----------------------------------------------------------------------------
    def load_config(file_name: str) -> Box:
        path = pathlib.Path(__file__).parent / file_name
        try:
            with path.open(mode="r") as config_file:
                return Box(frozen_box=True).from_yaml(config_file.read())

        except FileNotFoundError as e:
            print(f"The directory {path.name} does not exist \n\n {e}")
            sys.exit(1)

        except PermissionError as e:
            print(f"Permission denied to access the directory {path.name}\n\n {e}")
            sys.exit(1)

        except OSError as e:
            print(f"An OS error occurred: {e}")
            sys.exit(1)

        finally:
            path = None
The function load_config will take a file name as an argument and return the python-box object.
Next we get the path of the file and store it in the path variable.
    try:
        with path.open(mode="r") as config_file:
            return Box(frozen_box=True).from_yaml(config_file.read())
Now we will open the file, and convert the .yaml file into a Box object to be used by the FileUpload class later on.
    sftp_config: Box = load_config("sftp_config.yaml")
As the last step, we call the load_config function and keep the result in sftp_config, which the rest of the code imports.
File: file_upload.py
The FileUpload class will be used to connect to the sFTP server and then upload files. Additional methods could be added to download files; I will leave that exercise to you.
I am going to use the paramiko package. Again if not installed, from the command line use pip to install.
    pip install paramiko
    # -- LIBRARY -----------------------------------------------------------------
    import os
    from pathlib import Path
    import paramiko as pa

    # -- LOCAL FILES -------------------------------------------------------------
    from config import sftp_config

    class FileUpload():

        def __init__(self) -> None:
            self._file_list: list = None
            self._ssh_client: pa.SSHClient = None
            self._sftp_client: pa.SFTPClient = None
            self._known_hosts_file_path: Path = None

            # set the file paths
            self.__set_file_paths()

        def __set_file_paths(self) -> None:
            self._known_hosts_file_path = os.path.normpath("".join([os.getcwd(),sftp_config.SERVER.KNOWN_HOSTS]))

            if os.path.isfile(self._known_hosts_file_path) == False:
                raise Exception(f"Known_Hosts file can't be found: {self._known_hosts_file_path}")

        def __connect_client(self) -> None:
            # get the ssh key
            sftp_key = pa.RSAKey.from_private_key_file(sftp_config.SERVER.KEY_FILENAME)

            # set the ssh client
            self._ssh_client = pa.SSHClient()
            self._ssh_client.load_host_keys(self._known_hosts_file_path)

            # connect to server
            self._ssh_client.connect(hostname=sftp_config.SERVER.HOSTNAME,
                                     username=sftp_config.SERVER.USERNAME,
                                     pkey=sftp_key)

        def send_files(self, file_list:list) -> None:
            # connect to ssh client
            self.__connect_client()

            # init the sftp client
            self._sftp_client = self._ssh_client.open_sftp()
            self._sftp_client.chdir(sftp_config.FILE.REMOTE_DIRECTORY)

            # cycle through each file and put them on server
            for file in file_list:
                local_file_path = os.path.join(file)
                remote_file_path = "".join([sftp_config.FILE.REMOTE_DIRECTORY, os.path.basename(file)])

                try:
                    self._sftp_client.put(localpath=local_file_path,
                                          remotepath=remote_file_path)
                except FileNotFoundError as err:
                    print(f"File {local_file_path} not found locally")

            self._sftp_client.close()
            self._ssh_client.close()
    # -- LIBRARY -----------------------------------------------------------------
    import os
    from pathlib import Path
    import paramiko as pa

    # -- LOCAL FILES -------------------------------------------------------------
    from config import sftp_config
Here we import the standard libraries, paramiko, and the sftp_config object we created earlier.
    class FileUpload():

        def __init__(self) -> None:
            self._file_list: list = None
            self._ssh_client: pa.SSHClient = None
            self._sftp_client: pa.SFTPClient = None
            self._known_hosts_file_path: Path = None

            # set the file paths
            self.__set_file_paths()
Here we define the class and set up some variables to hold the ssh client, the sftp client, and the known_hosts file path.
__set_file_paths
    def __set_file_paths(self) -> None:
        self._known_hosts_file_path = os.path.normpath("".join([os.getcwd(),sftp_config.SERVER.KNOWN_HOSTS]))

        if os.path.isfile(self._known_hosts_file_path) == False:
            raise Exception(f"Known_Hosts file can't be found: {self._known_hosts_file_path}")
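Now we build the full path to the known_hosts file by joining the current working directory with the KNOWN_HOSTS value from the configuration. The file itself is read later, when we connect. If the file doesn’t exist, we raise an Exception.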
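Because the path is built from os.getcwd(), the lookup only works when the program is started from the project root. As an alternative sketch (not the approach used in this post), the path can be resolved against an explicit base directory with pathlib; resolve_known_hosts and base_dir are hypothetical names:

```python
from pathlib import Path


def resolve_known_hosts(base_dir: Path, configured: str) -> Path:
    """Join a configured value such as '/config/known_hosts' onto base_dir."""
    # Strip leading separators so the value is treated as relative to base_dir
    # rather than as an absolute path.
    candidate = base_dir / configured.lstrip("/\\")
    if not candidate.is_file():
        raise FileNotFoundError(f"known_hosts file can't be found: {candidate}")
    return candidate
```

With the layout used here, base_dir could be something like Path(__file__).resolve().parent.parent when called from src/file_upload.py, which keeps the lookup independent of where the program is launched from.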
__connect_client
When we connect to the ssh server, we will need a key. The user’s private key should be stored locally on the machine that runs this program, and it needs to be in the .pem file format. Either create the key pair on the ssh server, or ask the ssh server’s admin to send one to you.
    $ ssh-keygen -b 4096
    $ cat .ssh/id_rsa.pub >> .ssh/authorized_keys
Then copy the .pem file to your local machine via FileZilla or another sftp tool.
    def __connect_client(self) -> None:
        # get the ssh key
        sftp_key = pa.RSAKey.from_private_key_file(sftp_config.SERVER.KEY_FILENAME)

        # set the ssh client
        self._ssh_client = pa.SSHClient()
        self._ssh_client.load_host_keys(self._known_hosts_file_path)

        # connect to server
        self._ssh_client.connect(hostname=sftp_config.SERVER.HOSTNAME,
                                 username=sftp_config.SERVER.USERNAME,
                                 pkey=sftp_key)
We use a paramiko function to read the key from the file.
Next we create the SSH client and store it in self._ssh_client.
Now we need to load the list of known hosts to avoid a connection error.
Lastly we connect to the ssh server.
known_hosts file
One of the more common problems I saw with the paramiko package was people struggling with the error raised when there is no known hosts file. Often the “solution” proposed by others was to use the following line of code.
    # auto add to known hosts
    self._ssh_client.set_missing_host_key_policy(pa.AutoAddPolicy())
Using this option means any unknown host key is accepted without question, which leaves you open to man-in-the-middle attacks. The proper way is to use a known_hosts file.
The file is a simple text file, with no file extension. You can have one if you want but… why?
Your known_hosts file should look like the following.
    MYSERVER-ONE ssh-rsa AAAAB3NzaC.....=
    MYSERVER-ONE ecdsa-sha2-nistp256 AAAAE2VjZHN.....=
    MYSERVER-ONE ssh-ed25519 AAAAC3.../O
Here the “….” stands in for many more characters. You should get the known hosts entries from the ssh server admin. However, you can generate them yourself if you are positive the connection to the server is secure.
    $ ssh-keyscan MYSERVER-ONE
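Before connecting, it can be useful to confirm that the server you are about to contact is actually listed in the file. A rough sketch using only the standard library, assuming plain (unhashed) entries in the format shown above; known_host_names is a hypothetical helper, and the key strings in the sample are placeholders rather than real keys:

```python
def known_host_names(known_hosts_text: str) -> set:
    """Collect the plain-text host names listed in known_hosts content."""
    names = set()
    for line in known_hosts_text.splitlines():
        line = line.strip()
        # Skip blanks, comments, marker lines (such as @cert-authority) and
        # hashed entries, which start with "|" and cannot be read back.
        if not line or line.startswith(("#", "@", "|")):
            continue
        fields = line.split()
        if len(fields) < 3:
            continue  # not a "hosts keytype key" entry
        # The first field may hold several comma-separated names or addresses.
        names.update(fields[0].split(","))
    return names


sample = """\
MYSERVER-ONE ssh-rsa AAAAB3NzaC
MYSERVER-ONE ssh-ed25519 AAAAC3
# a comment line
"""
print("MYSERVER-ONE" in known_host_names(sample))  # True
```

Entries for servers on a non-default port are written as [host]:port, so a check for such a server would need to look for that form instead of the bare host name.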
send_files
Here is the method for sending the files.
    def send_files(self, file_list:list) -> None:
        # connect to ssh client
        self.__connect_client()

        # init the sftp client
        self._sftp_client = self._ssh_client.open_sftp()
        self._sftp_client.chdir(sftp_config.FILE.REMOTE_DIRECTORY)

        # cycle through each file and put them on server
        for file in file_list:
            local_file_path = os.path.join(file)
            remote_file_path = "".join([sftp_config.FILE.REMOTE_DIRECTORY, os.path.basename(file)])

            try:
                self._sftp_client.put(localpath=local_file_path,
                                      remotepath=remote_file_path)
            except FileNotFoundError as err:
                print(f"File {local_file_path} not found locally")

        self._sftp_client.close()
        self._ssh_client.close()
    # connect to ssh client
    self.__connect_client()

    # init the sftp client
    self._sftp_client = self._ssh_client.open_sftp()
    self._sftp_client.chdir(sftp_config.FILE.REMOTE_DIRECTORY)
First call the private method to connect to the ssh_client.
Next we set the sftp client object.
Next we change the directory to the remote directory we set in the configuration file earlier.
    # cycle through each file and put them on server
    for file in file_list:
        local_file_path = os.path.join(file)
        remote_file_path = "".join([sftp_config.FILE.REMOTE_DIRECTORY, os.path.basename(file)])

        try:
            self._sftp_client.put(localpath=local_file_path,
                                  remotepath=remote_file_path)
        except FileNotFoundError as err:
            print(f"File {local_file_path} not found locally")
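Now we have a simple loop that reads the name of each file in file_list and tries to upload (PUT) the file to the sFTP server. The remote path is built by appending the file name to REMOTE_DIRECTORY, so that value must end with a slash.
If the local file path is invalid, the resulting FileNotFoundError is caught, a message is printed, and the loop moves on to the next file.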
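The same loop can also be written as a standalone function that reports which files were skipped and always closes the client. This is a sketch under assumptions, not the post’s implementation: upload_files is a hypothetical name, sftp_client is assumed to be a paramiko SFTPClient (or anything offering put and close), and posixpath.join replaces the string join so that the remote directory works with or without a trailing slash.

```python
import os
import posixpath


def upload_files(sftp_client, file_list: list, remote_directory: str) -> list:
    """Upload each file and return the local paths that could not be found.

    sftp_client is assumed to behave like paramiko.SFTPClient, i.e. to offer
    put(localpath, remotepath) and close().
    """
    missing = []
    try:
        for local_path in file_list:
            # Remote sFTP paths use forward slashes regardless of the local OS.
            remote_path = posixpath.join(remote_directory,
                                         os.path.basename(local_path))
            try:
                sftp_client.put(localpath=local_path, remotepath=remote_path)
            except FileNotFoundError:
                missing.append(local_path)
    finally:
        # Runs even if put() raises something other than FileNotFoundError.
        sftp_client.close()
    return missing
```

Returning the skipped paths instead of only printing them lets the caller decide whether a missing file should stop the run or just be logged.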
    self._sftp_client.close()
    self._ssh_client.close()
And at the end we close the sftp_client and ssh_client.
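Downloading was left as an exercise earlier in the post. One possible shape for it, again only as a sketch: download_files is a hypothetical function, sftp_client is assumed to be a paramiko SFTPClient such as the one returned by open_sftp(), and get(remotepath, localpath) is the paramiko counterpart of put.

```python
import os
import posixpath


def download_files(sftp_client, file_names: list, remote_directory: str,
                   local_directory: str) -> list:
    """Fetch each named file from remote_directory into local_directory.

    Returns the local paths that were written.
    """
    os.makedirs(local_directory, exist_ok=True)
    downloaded = []
    for name in file_names:
        remote_path = posixpath.join(remote_directory, name)
        local_path = os.path.join(local_directory, name)
        try:
            sftp_client.get(remotepath=remote_path, localpath=local_path)
        except FileNotFoundError:
            # Assumption: a missing remote file surfaces as FileNotFoundError.
            print(f"File {remote_path} not found on the server")
            continue
        downloaded.append(local_path)
    return downloaded
```

In the FileUpload class this could become a get_files method that connects first, calls the loop with self._sftp_client, and then closes both clients in the same way send_files does.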
That’s it. You can now connect to an sFTP server from Python and upload your files.