Ruby Net::FTP & Extended Passive Mode
Interacting with FTP servers programmatically can be a laborious process. Fortunately, Ruby’s standard library usually makes it a breeze. Sometimes, though, things are not so easy.
Net::FTP is cribbed almost directly from Perl, and as a result is a mature library that makes most standard tasks (e.g. query for a list of files, fetch a specified file) easy.
However, in the world of FTP, the enemy of all good connections is the firewall, which is why the specification includes passive mode. A complete description of what passive mode is can be found here, but the gist is that in passive mode the client initiates both the control and data channels (as opposed to active mode, where the server establishes the data channel after receiving the port the client is listening on).
Passive Mode (PASV)
Ruby allows us to use passive mode for FTP connections with a simple method: setting passive to true on the connection object.
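A minimal sketch of such a connection might look like the following. The helper name list_remote_files, the host and the credentials are all placeholders invented for illustration; only Net::FTP.open, passive=, login and list come from the library itself.

```ruby
require 'net/ftp'

# Hypothetical helper: host, user and password are placeholders
# supplied by the caller, not values from the original post.
def list_remote_files(host, user, password)
  Net::FTP.open(host) do |ftp|
    ftp.passive = true         # the client opens the data channel too
    ftp.login(user, password)
    ftp.list                   # directory listing, one string per entry
  end
end

# Example call (needs a reachable FTP server):
#   puts list_remote_files('ftp.example.com', 'anonymous', 'guest')
```

Passive mode only affects how data connections are opened, so setting it after the control connection is established but before the first transfer is enough.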
Here, we create a new FTP connection to a server with passive mode enabled, and ask for a list of files on the server. In the terminal, we would expect to see a listing of the files in the directory.
But oh no! Something has gone wrong. Instead of a listing, the request timed out.
Timed out? But we were just connected!
Extended Passive Mode (EPSV)
FTP was devised in the 1970s, long before NAT (Network Address Translation) and firewalls were a thing, let alone widespread. Passive mode was added to the FTP specification to deal with this problem. However, some servers will only play nice with extended passive mode, which was added to the specification in 1998 (to encompass IPv6 addresses). EPSV works for some servers where PASV fails because routers and firewalls can tamper with FTP traffic sent under PASV: a PASV reply embeds an IP address as well as a port, whereas an EPSV reply carries only a port and the client reuses the address of the control connection.
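To make the difference between the two replies concrete, here is a small pure-Ruby sketch that parses each one. The function names and the sample addresses are made up for illustration; Net::FTP has its own internal parsers, and this is not a copy of them.

```ruby
# PASV reply: "227 Entering Passive Mode (h1,h2,h3,h4,p1,p2)".
# The server states an IP address and a port (p1 * 256 + p2).
def parse_pasv_reply(reply)
  m = reply.match(/\((\d+),(\d+),(\d+),(\d+),(\d+),(\d+)\)/)
  raise ArgumentError, "not a PASV reply: #{reply}" unless reply.start_with?('227') && m
  h1, h2, h3, h4, p1, p2 = m.captures.map(&:to_i)
  ["#{h1}.#{h2}.#{h3}.#{h4}", p1 * 256 + p2]
end

# EPSV reply: "229 Entering Extended Passive Mode (|||port|)".
# Only a port is given, so the client reuses the control connection's host.
def parse_epsv_reply(reply, control_host)
  m = reply.match(/\((.)\1\1(\d+)\1\)/)
  raise ArgumentError, "not an EPSV reply: #{reply}" unless reply.start_with?('229') && m
  [control_host, m[2].to_i]
end

p parse_pasv_reply('227 Entering Passive Mode (10,0,0,5,195,80)')
# => ["10.0.0.5", 50000]
p parse_epsv_reply('229 Entering Extended Passive Mode (|||50000|)', '203.0.113.7')
# => ["203.0.113.7", 50000]
```

In the PASV case the advertised address (10.0.0.5 here) may be a private one that the client cannot reach from outside the server's network, which is one way a data connection can time out even though the control connection works.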
So how do we get around this? Net::FTP is relatively old, and doesn’t have a built-in method for enforcing EPSV. What we can do is override the makepasv method in Net::FTP to enforce it.
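A sketch of such an override, written from the description in this post rather than taken from any official API, could look like this. It relies on Net::FTP internals (@sock, sendcmd, parse229 and makepasv itself), which are undocumented and may differ between Ruby versions; the alias name makepasv_without_epsv is hypothetical.

```ruby
require 'net/ftp'

module Net
  class FTP
    # Keep the stock implementation reachable under a hypothetical alias.
    alias_method :makepasv_without_epsv, :makepasv

    private

    # Ask the server for EPSV instead of PASV on IPv4 connections.
    def makepasv
      # peeraddr => [address family, port, hostname, numeric address]
      if @sock.peeraddr[0] == 'AF_INET'
        host, port = parse229(sendcmd('EPSV'))
      else
        # Anything else falls back to the library's own behaviour.
        host, port = makepasv_without_epsv
      end
      [host, port]
    end
  end
end
```

Because this reopens Net::FTP, every connection made after the code is loaded is affected.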
In the overridden makepasv, we check whether the address family is AF_INET (the IPv4 address family in the Berkeley sockets API). It sits at index 0 of an array that contains the results of a reverse lookup of the address we are connecting to, which includes the address family, port, hostname and numeric (IP) address.
If it is, we set our host and port to the result of calling parse229 on the response received from sending the EPSV command to the server.
If we append the above code to our test, it successfully lists the contents of the directory.
Real World Usage
The problem with this approach is that it is not appropriate to apply it to every use of Net::FTP in the codebase, whether by adding it manually or through a monkey-patch (in Rails). Some servers do not support EPSV but do support PASV, so the overridden method should only be used as and when it is needed.
How this is accomplished is up to the reader, as different solutions will suit different use cases.
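One possible approach, offered only as a sketch, is to put the override in a module and extend individual connections with it, leaving every other Net::FTP object untouched. The names ForceEpsv and new_ftp are hypothetical, and the module depends on the same undocumented Net::FTP internals (@sock, sendcmd, parse229) as any makepasv override.

```ruby
require 'net/ftp'

# Hypothetical module: forces EPSV on IPv4 for objects that extend it.
module ForceEpsv
  private

  def makepasv
    if @sock.peeraddr[0] == 'AF_INET'
      parse229(sendcmd('EPSV'))   # returns [host, port]
    else
      super                       # stock Net::FTP behaviour
    end
  end
end

# Hypothetical factory: opt in to EPSV per connection.
def new_ftp(epsv: false)
  ftp = Net::FTP.new              # not connected yet
  ftp.passive = true
  ftp.extend(ForceEpsv) if epsv
  ftp
end

# Example use (needs a reachable server):
#   ftp = new_ftp(epsv: true)
#   ftp.connect('ftp.example.com')
#   ftp.login('anonymous', 'guest')
#   puts ftp.list
```

Which servers need the flag could live in configuration alongside their hostnames, so PASV-only servers keep working as before.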
This is not a definitive solution to the problem (ideally the intervening firewall would be correctly configured), but if you frequently fetch data from many different FTP servers, it can come in handy.