Wednesday, April 25, 2012

Blocking versus non-blocking sockets

In an operating system, sockets are treated like file-handles that you read from and write to. You can read in blocking mode, which is the default, or in non-blocking mode. In the code described below I assume Linux is the operating system, but the principles are the same for Windows and BSD, although the function names and constants will be different.

Blocking means that if data is not available for reading or if the device is not ready for writing then the operating system will wait on a request to read from or write to a socket until it either gets or sends the data or times out. In other words the program may halt at that point for quite some time if it can't proceed.

Non-blocking means that the request to read or write on a socket returns immediately whether or not it was successful. It is then the task of the programmer to decide what to do next: to try again or to consider the read/write operation complete. Non-blocking is usually much faster overall, because the program can do other work instead of waiting, but it is a bit more complex to set up and manage.

The process of sending and receiving data over a socket is the same in both the blocking and non-blocking cases. There are five steps:

  1. Creating the socket
  2. Binding it to a local IP-address and port
  3. Connecting it to a remote IP-address and port
  4. Writing data over the connection
  5. Reading the response

Steps 3, 4 and 5 may involve sending or receiving packets of data, and so may be performed in either blocking or non-blocking mode.

Creating a socket

In order to send data over an IP connection you have to decide whether the transmission will use IPv4 or IPv6. This will determine the template of the packets that will be sent. A socket is an endpoint of communication. The remote machine to which we will connect also sets up a socket and reads and writes to its remote socket as it listens in for requests from our socket. But we only need one socket to both read and write.

We also have to declare what sort of IP communication we will be carrying out: TCP or UDP. The former needs the 3-way TCP handshake first to establish a 'connection'. UDP does not, so a request to connect on a UDP socket doesn't send any packets. Let's say we want a standard IPv4, TCP socket. Our code will look like this:

#include <stdio.h>
#include <sys/socket.h>
int sock = socket( AF_INET, SOCK_STREAM, 0 );
if ( sock != -1 )
{...}
else
    printf("couldn't create a socket\n");

The constant AF_INET means that we want an IPv4 socket and SOCK_STREAM declares that it should be a TCP connection. The last argument is normally 0 for the 'protocol', which means that the operating system should choose the default. The return value is -1 if it fails, otherwise it will be an integer - usually a small one - which is the identifier of the socket.
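The other combinations of family and transport follow the same pattern. As a minimal sketch (the helper name open_socket is my own and is not part of the test program), the two choices can be passed in as flags:

```c
#include <sys/socket.h>

/* Hypothetical helper: create a socket for the chosen family and transport.
   Returns the socket descriptor, or -1 on failure (errno is set by socket). */
int open_socket( int use_ipv6, int use_udp )
{
    int family = use_ipv6 ? AF_INET6 : AF_INET;
    int type = use_udp ? SOCK_DGRAM : SOCK_STREAM;
    /* protocol 0 lets the operating system pick the default for this type */
    return socket( family, type, 0 );
}
```

Calling open_socket(0, 0) should give the same kind of socket as the code above, and open_socket(0, 1) an IPv4 UDP socket.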

Binding to a local IP-address and port

Before it can be used to send IP-packets the socket has to be 'bound' to a local IP-address and local port. The local port usually doesn't matter, but it is put into the TCP header (the address goes into the IP header) because it is the port to which the remote application will send its replies. Usually we just specify 0 and the operating system will choose a free port for us. More importantly we must choose a valid local IP-address. This can be the default IP-address of some interface such as localhost (127.0.0.1) or that of any other interface, or even an alias of an interface's main address. So a bind call on localhost looks like this:

struct sockaddr_in addr;
/* clear the structure so that unused fields are zero */
memset( &addr, 0, sizeof(addr) );
addr.sin_family = AF_INET;
/* use a random port as the socket's source port */
addr.sin_port = 0;
/* load the address of localhost as the socket's source address */
int res = inet_pton( AF_INET, "127.0.0.1", &addr.sin_addr );
if ( res != 1 )
    printf("inet_pton error %s\n",strerror(errno));
else
{
    res = bind( sock, (const struct sockaddr *)&addr,sizeof(addr));
    if ( res != -1 )
    {
        printf("bound socket %d to 127.0.0.1 \n",sock);
        ...
    }
    else
    {
        printf("failed to bind to 127.0.0.1\n");
    }
}

The sockaddr_in structure is for IPv4 connections. Note that bind expects a generic struct sockaddr pointer, which could be an IPv6 address. So we have to cast our IPv4 structure to the generic type. We set the IPv4 address in the structure to 127.0.0.1 via a call to inet_pton. This just encodes the four numbers expressed as a string into a 32-bit address in network byte order for us.
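When the port is given as 0 it can be useful to find out which port the operating system actually chose. One way to do that is getsockname; the sketch below assumes an IPv4 socket that has already been bound, and the helper name local_port is hypothetical:

```c
#include <string.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>

/* Hypothetical helper: return the local port (in host byte order) that a
   bound IPv4 socket is using, or -1 if it cannot be determined. */
int local_port( int sock )
{
    struct sockaddr_in addr;
    socklen_t len = sizeof(addr);
    memset( &addr, 0, sizeof(addr) );
    if ( getsockname( sock, (struct sockaddr *)&addr, &len ) == -1 )
        return -1;
    /* sin_port is stored in network byte order */
    return ntohs( addr.sin_port );
}
```

After a successful bind with sin_port set to 0, local_port(sock) should report the free port that was picked.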

Connecting

In TCP we first have to establish a connection by sending a SYN packet. The server then replies with SYN-ACK, and the client answers with an ACK. All this is done by the connect function. If the socket is not already bound explicitly to an IP-address and port (i.e. if we didn't call bind) then connect will bind it for us to the default interface's default IP-address and some random port. Usually we want to control that, though. So the connect call looks like this:

int do_connect( int sock, char *host, char *port )
{
    struct sockaddr_in addr;
    /* clear addr structure first */
    memset( &addr, 0, sizeof(addr) );
    /* fill in the address family and the remote host */
    addr.sin_family = AF_INET;
    int res = inet_pton( AF_INET,host,&addr.sin_addr);
    if ( res == 1 )
    {
        /* port number must be in network byte order */
        addr.sin_port = htons(atoi(port));
        /* establish TCP connection via handshake (SYN,SYN-ACK,ACK) */
        res = connect(sock,(const struct sockaddr *)&addr, sizeof(addr));
        if ( res == 0 )
        {
            printf("connected successfully to %s on port %s\n",host,port);
            return 1;
        }
        else
            printf("couldn't connect to %s on port %s\n",host,port);
    }
    else
        printf("inet_pton failed: %s\n",strerror(errno) );
    return 0;
}

Apart from the socket, the two parameters are host, which is the IP-address of the remote server we want to connect to, and port, which is the port we want to connect on. This time 'port' has to be a real port. A random one won't do. The functions htons and atoi just turn the string representation of port into the correct numerical form. So if we wanted to connect to the BBC web-server the value of host would be 212.58.244.66 and the port 80. We use the same kind of addr structure as for bind, but fill it with the remote values this time. We return 1 on success and 0 on failure. If successful, our socket is connected and we can start sending and receiving data on it.
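do_connect only accepts a numeric IP-address. If a host name is wanted instead, the standard getaddrinfo function can fill in the address structure. This is only a sketch of one way to do it, not what the test program does, and the helper name resolve_ipv4 is my own:

```c
#include <string.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <netdb.h>
#include <netinet/in.h>

/* Hypothetical helper: fill 'out' with the first IPv4/TCP address found for
   host and port. Returns 1 on success and 0 on failure, like do_connect. */
int resolve_ipv4( const char *host, const char *port, struct sockaddr_in *out )
{
    struct addrinfo hints, *list = NULL;
    memset( &hints, 0, sizeof(hints) );
    hints.ai_family = AF_INET;        /* IPv4 only, to match do_connect */
    hints.ai_socktype = SOCK_STREAM;  /* TCP */
    if ( getaddrinfo( host, port, &hints, &list ) != 0 || list == NULL )
        return 0;
    /* address and port are already in network byte order */
    memcpy( out, list->ai_addr, sizeof(*out) );
    freeaddrinfo( list );
    return 1;
}
```

The filled-in structure can then be cast and passed to connect exactly as addr is in do_connect.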

Sending data

static ssize_t writen( int sock, const void *vptr, size_t n )
{
    size_t nleft;
    ssize_t nwritten;
    const char *ptr;
    ptr = vptr;
    nleft = n;
    while ( nleft > 0 )
    {
        if ((nwritten = write(sock,ptr,nleft)) <= 0 )
        {
            /* only trust errno when write actually reported an error */
            if ( nwritten < 0 && errno == EINTR )
                nwritten = 0;
            else
                return -1;
        }
        nleft -= nwritten;
        ptr += nwritten;
    }
    return n;
}

This function writes an arbitrary amount of data to the socket we connected in the previous step. We may not be able to send all the data in one go, so the writen function keeps looping until it is all sent. The write function also works for files, and it blocks by default. So if the buffer to write to isn't ready, because the connection is down or slow, then it will wait. If write is interrupted by a signal it returns -1 with errno set to EINTR; in that case we count it as zero bytes written and try again. Any other error makes writen return -1.

Reading the response

If we sent the server a message like an HTTP GET call, we will want to receive the reply on the same local port we encoded into the packets we sent by the call to writen. So we just call the read function and loop until read returns 0, which means the server has closed the connection:

static int read_blocking( int sock )
{
    int n,total = 0;
    for ( ; ; )
    {
        n=read( sock, line, MAXLINE );
        if ( n < 0 )
        {
            total = -1;
            printf( "failed to read. err=%s socket=%d\n",
               strerror(errno),sock);
            break;
        }
        else if ( n == 0 )
        {
            // just finished reading
            break;
        }
        else
            total += n;
    }
    return total;
}

line is just a buffer we fill with the response, of length MAXLINE. Note that in this simple function we just throw away the data, and only read it one MAXLINE chunk at a time.
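If the response is needed rather than just its length, the same loop can append each chunk to a growing buffer. The following is a minimal sketch under that assumption; read_all is a hypothetical name, and for brevity it treats EINTR as an error:

```c
#include <stdlib.h>
#include <unistd.h>

/* Hypothetical helper: read from sock until end-of-file, keeping the data.
   On success returns a malloc'd buffer (the caller frees it) and stores its
   length in *len; returns NULL on a read or memory error. */
char *read_all( int sock, size_t *len )
{
    size_t cap = 4096, used = 0;
    char *buf = malloc( cap );
    if ( buf == NULL )
        return NULL;
    for ( ; ; )
    {
        ssize_t n = read( sock, buf + used, cap - used );
        if ( n < 0 )
        {
            free( buf );
            return NULL;
        }
        if ( n == 0 )
            break;              /* the peer closed the connection */
        used += (size_t)n;
        if ( used == cap )
        {
            /* buffer full: double it before reading more */
            char *bigger = realloc( buf, cap * 2 );
            if ( bigger == NULL )
            {
                free( buf );
                return NULL;
            }
            buf = bigger;
            cap *= 2;
        }
    }
    *len = used;
    return buf;
}
```

The buffer is not NUL-terminated, so use the returned length rather than strlen.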

Non-blocking In/out

Three of those calls send data: connect (the TCP handshake), write and read. Each may block. So to make the process non-blocking we have to remember which of those three states we are in so we know what to do next. We start in the connect state, and when that has completed we move to writing, and when that has finished we can move to read. But first we have to change the socket so that it returns immediately on a call to connect, write or read:

int make_nonblocking( int sock )
{
    /* get existing socket flags */
    int flags = fcntl (sock, F_GETFL, 0 );
    if ( flags == -1 )
    {
        printf("failed to read flags of socket %d\n",sock);
        return 0;
    }
    /* switch socket to non-blocking mode */
    int res = fcntl( sock, F_SETFL, flags | O_NONBLOCK );
    if ( res == -1 )
    {
        printf("failed to make socket %d non-blocking\n",sock);
        return 0;
    }
    return 1;
}

Here we use the fcntl function (file control) to change the file-handle, aka socket, to non-blocking mode. But first we must get the current state of the socket in case there were other settings. We add the 'non-blocking' flag (O_NONBLOCK) by bitwise ORing it with the current flags (flags) and the socket's behaviour will be changed. Again, we must remember to test for an error.
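To see what the flag changes, here is a small sketch that switches a descriptor to non-blocking mode and then tries to read one byte. With no data waiting, read fails at once with EAGAIN instead of waiting. The helper name read_would_block is hypothetical, and note that it consumes a byte if one is available:

```c
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical helper: put fd into non-blocking mode and try to read a byte.
   Returns 1 if the read would have blocked, 0 if data or end-of-file was
   available, and -1 on any other error. */
int read_would_block( int fd )
{
    char byte;
    int flags = fcntl( fd, F_GETFL, 0 );
    if ( flags == -1 || fcntl( fd, F_SETFL, flags | O_NONBLOCK ) == -1 )
        return -1;
    ssize_t n = read( fd, &byte, 1 );
    if ( n >= 0 )
        return 0;
    /* both names are checked because some systems give them different values */
    if ( errno == EAGAIN || errno == EWOULDBLOCK )
        return 1;
    return -1;
}
```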

Non-blocking connect, write, read

Converting the blocking in/out to non-blocking involves writing a simple finite state machine. For each state we will call try_something to try to complete that state. If it succeeds we move to the next state.

int sendnb( char **argv )
{
    int res;
    int sock = tcp_bind( 0, "127.0.0.1" );
    if ( sock != -1 )
    {
        do
        {
            switch ( state )
            {
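                case initial:
                    res = do_connect( sock, argv[1], argv[2] );
                    if ( res )
                        state = writing;
                    else if ( errno == EINPROGRESS )
                        state = connecting; /* handshake still pending */
                    else
                        state = error;
                    break;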
                case connecting:
                    res = try_connect( sock );
                    if ( res )
                        state = writing;
                    break;
                case writing:
                    res = try_writen( sock );
                    if ( res )
                        state = reading;
                    break;
                case reading:
                    res = try_readn( sock );
                    if ( res )
                        state = done;
                    break;
            }
        } 
        while ( state != done && state != error );
        close( sock );
        if ( state == done )
            return 1;
    }
    return 0;
}
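The excerpt uses state and the names initial, connecting, writing, reading, done and error without showing their declarations. A plausible sketch, assuming they are a plain enum and a global variable (the type name conn_state and the helper next_state are my own), would be:

```c
/* Assumed declarations: the excerpts use these names but do not show them. */
enum conn_state { initial, connecting, writing, reading, done, error };
enum conn_state state = initial;

/* Hypothetical helper: the state that follows s once its step has succeeded. */
enum conn_state next_state( enum conn_state s )
{
    switch ( s )
    {
        case initial:    return writing;  /* connect completed at once */
        case connecting: return writing;
        case writing:    return reading;
        case reading:    return done;
        default:         return s;        /* done and error are final */
    }
}
```

This lists the same transitions as the switch in sendnb in one place, which makes it easier to check that every state has a way forward.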

Let's take them one at a time.

Establishing a connection asynchronously

We call the ordinary do_connect, except that, since the socket itself has been made non-blocking, connect will probably fail with errno set to EINPROGRESS. That is normal: the handshake is still under way, so we wait in the connecting state. On subsequent passes we must not call do_connect again, but instead test whether the pending connection has been made. This means we call poll on the socket to see if it is ready for writing:

int try_connect( int sock )
{
    struct pollfd fds[1];
    fds[0].fd = sock;
    fds[0].events = POLLWRBAND | POLLOUT;

    int res = poll( fds, 1, POLL_TIMEOUT_MSECS );
    if ( res == 1 )
    {
        return 1;
    }
    else if ( res == -1 )
    {
        state = error;
    }
    return 0;
}

The timeout parameter to poll can be 0 but we set it to 5 milliseconds just so we don't keep calling it over and over. Poll works by testing the socket for readiness for some event that we ask about, such as being ready for output (POLLOUT).
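One caveat is that poll also reports the socket as ready when the connection attempt has failed, so a stricter version asks the socket for the outcome via the SO_ERROR option. This is a sketch of that check rather than the code used in the test program, and finish_connect is a hypothetical name:

```c
#include <errno.h>
#include <poll.h>
#include <sys/socket.h>

/* Hypothetical helper: wait up to timeout_ms for a pending non-blocking
   connect to finish. Returns 1 if connected, 0 if still pending, and -1 if
   the attempt failed (errno then holds the reason). */
int finish_connect( int sock, int timeout_ms )
{
    struct pollfd pfd;
    pfd.fd = sock;
    pfd.events = POLLOUT;
    pfd.revents = 0;
    int res = poll( &pfd, 1, timeout_ms );
    if ( res == 0 )
        return 0;               /* timed out: handshake still in progress */
    if ( res < 0 )
        return ( errno == EINTR ) ? 0 : -1;
    /* ready for writing can also mean "failed", so fetch the real result */
    int err = 0;
    socklen_t len = sizeof(err);
    if ( getsockopt( sock, SOL_SOCKET, SO_ERROR, &err, &len ) == -1 )
        return -1;
    if ( err != 0 )
    {
        errno = err;
        return -1;
    }
    return 1;
}
```

In the state machine a return of 1 would move to writing, 0 would stay in connecting, and -1 would move to error.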

Writing asynchronously

When writing we must cover the case that not all the writing can be carried out without blocking. Then we return immediately and call poll next time. Otherwise this routine is the same as the blocking I/O.

int try_writen( int sock )
{
    ssize_t nwritten;
    struct pollfd fds[1];
    /* poll must be told which socket to watch */
    fds[0].fd = sock;
    fds[0].events = POLLOUT | POLLWRBAND;
    fds[0].revents = 0;
    int res = poll(fds, 1, POLL_TIMEOUT_MSECS);
    if ( res > 0 )
    {
        int nleft = message_len-message_pos;
        while ( nleft > 0 )
        {
            if ((nwritten = write(sock,&message[message_pos],nleft)) <= 0 )
            {
                /* EINTR and EAGAIN just mean "try again later" */
                if ( errno != EINTR && errno != EAGAIN )
                    state = error;
                return 0;
            }
            nleft -= nwritten;
            if ( nleft > 0 )
                message_pos += nwritten;
            else
            {
                message_pos = 0;
                return 1;
            }
        }
    }
    else if ( res < 0 )
    {
        printf( "error: %s\n", strerror(errno) );
        state = error;
    }
    return 0;
}

Reading asynchronously

Reading asynchronously is similar to synchronous read, except that we must cover the case where errno is EAGAIN. Then we return immediately as for write and call poll again next time.

int try_readn( int sock )
{
    int n;
    struct pollfd fds[1];
    /* poll must be told which socket to watch */
    fds[0].fd = sock;
    fds[0].events = POLLIN | POLLPRI;
    fds[0].revents = 0;
    int res = poll(fds, 1, POLL_TIMEOUT_MSECS);
    if ( res > 0 )
    {
        for ( ; ; )
        {
            /* as in read_blocking, the data itself is thrown away */
            n=read( sock, line, MAXLINE );
            if ( n < 0 )
            {
                /* EINTR and EAGAIN just mean "try again later" */
                if ( errno != EINTR && errno != EAGAIN )
                {
                    state = error;
                    printf("error: %s\n",strerror(errno));
                }
                return 0;
            }
            else if ( n == 0 )
            {
                /* finished reading: report success even if this
                   particular call received no new data */
                return 1;
            }
        }
    }
    else if ( res < 0 )
    {
        printf("error: %s\n",strerror(errno));
        state = error;
    }
    return 0;
}

Now we're done. Here's the complete test code. Enjoy, but no guarantees it works perfectly.

Thursday, April 19, 2012

Syncing program changes to other machines

Once you've set up a multi-machine attack it's great to be able to invoke it and see the results on the target. However, what if you need to modify the software? It's spread over 7 machines and any changes will have to be recompiled on each attacker. Ouch! So my idea was to write another script to not only use rsync to sync the local changes but also to recompile the correct module on each machine. Luckily the structure of botloader makes this easy. The master directory has an install.sh script, then each of the sub-folders has another script rebuild.sh, which is just invoked by install.sh. So the sync-changes script takes one argument, the name of the subfolder, and then does all the hard work automatically. Of course you must first set up passwordless login for this to run smoothly:

Wednesday, April 18, 2012

Turning off avahi-daemon

During our experiments some of the attackers performed poorly because they were running avahi-daemon. This is a service that looks for services on the network while the computer is idle. It can consume all the CPU on that machine. To turn it off you should use chkconfig rather than try to uninstall it, which might have unforeseen consequences (you may need to install chkconfig):

sudo chkconfig -s avahi-daemon off

or

sudo systemctl disable avahi-daemon.service

Then kill it if it is still running:

sudo service avahi-daemon stop

Thursday, April 12, 2012

Running programs on remote machines via ssh

We wanted to run our botloader program simultaneously on 7 machines, all attacking a single host. The idea was to recreate a large-scale flash event: the FIFA 1998 world cup semi-final. We needed 70,000 users, so each of the 7 machines would have to generate traffic from 10,000 hosts. We used botloader for that, but synchronising them all using a script proved hard to achieve.

The trick is to set up passwordless login as root from one master machine to all the machines participating in the attack. Then on the master machine we used rsync to synchronise the configuration file for botloader on each attacker with the copy on the master. Finally we ssh login to each attacker and issue the botloader command:


Putting the command at the end executes it remotely. You have to redirect the output, otherwise the ssh session will hang as it tries to return the output to your local terminal and you won't see any error messages. So you have to redirect stdin, stdout and stderr, and also background the process (that's the & at the end). The "nohup" ensures that the command keeps running after you log out.

Monday, February 13, 2012

Enabling SSL on Apache2 for Testing

Enabling SSL on an apache2 installation is easy. There are plenty of instructions on the Web for doing this, but I thought I'd describe the way to do it using the latest Ubuntu installation of apache2, which is idiosyncratic.

  1. First you need to generate a self-signed certificate. I used the following command:
    openssl req -new -x509 -nodes -out server.crt -keyout server.key
    Now create a directory for these files inside your apache2 installation:
    sudo mkdir /etc/apache2/certs/
    And move the certificates to that location:
    sudo mv server.* /etc/apache2/certs/
  2. Next, edit /etc/apache2/sites-available/default-ssl, and change the two directives:
    SSLCertificateFile    /etc/apache2/certs/server.crt
    SSLCertificateKeyFile /etc/apache2/certs/server.key
    
    So they now point to your files.
  3. Now enable the ssl module in apache2, and the default ssl site:
    sudo a2enmod ssl
    sudo a2ensite default-ssl
  4. Finally, restart apache2:
    sudo /etc/init.d/apache2 restart
    And it should work. Test it by going to https://localhost in the browser. It should give you a dialog complaining about how insecure this is. Just say that you understand the risks, enable the exception, and it will take you to the index.html page.

Setting up Joomla! as a test website

Joomla! has quite a lot of sample data that is useful for HTTP stress-testing. It provides a variety of resources that can be parsed and retrieved by the http_bot of botloader. Setup is easy, but I thought I'd put it on record so people can follow the installation quickly.

  1. On Ubuntu and other versions of Linux you'll need to install MySQL, php5 and apache2. To get the php to work you'll need the apache php module (libapache2-mod-php5). For MySQL you'll need to install both the server and the client. When you install the MySQL server it asks for a user name and password. Use "root" and give any password, but remember it, because you'll need it later.
  2. Now download Joomla!. Unzip the files and rename the directory to "joomla". Now copy the joomla directory to /var/www or wherever your web-root is located:
    cp -r joomla /var/www
    Now make sure that the installer can modify the joomla directory. cd into /var/www and type:
    sudo chown -R www-data joomla
    At least on Ubuntu 'www-data' is the name of the user who is running apache. Or you can use chmod -R +w joomla if you prefer, but that's a lot less secure, though it doesn't matter on a local testbed.
  3. Now edit index.html, which you'll find in /var/www, and add a line somewhere in the body of the HTML:
    <p>Why not visit our wonderful <a href="/joomla/">Joomla! site</a>?</p>
    This provides a link into the main sample data which http_bot will follow when attacking the site. Otherwise it will only find index.html, and keep downloading that, which is pretty ineffective. So this step is important.
  4. Now run the Joomla! installer. It's located at http://localhost/joomla/. Click through the pages, making sure that it detects Mysql.
    • If it says that the installation directory is unwritable, try the chmod -R +w joomla command from /var/www.
    • If it says that mysql is undetectable you need to install something - check that you have the mysql plugin for Apache.
    • When it asks for the mysql username and password give the ones you specified above in step 1.
    • When it asks if you want to install sample data, say YES.
    • For the rest, just follow the suggested options
  5. Now test the installation. Navigate to http://localhost, click on the link you created earlier and make sure that the website is all working. If it says "downloading" when you access a php page, you must have failed to install php5 correctly.

Sunday, February 12, 2012

Measure CPU usage on Windows with PDH

On Windows to measure the CPU usage you need to use the Performance Data Helper (PDH). Microsoft distribute a dll (pdh.dll). Although not ideal you can link with this. Pdh.lib would be better but I have no idea where to get it. I'm going to describe how to measure total CPU usage using pdh.dll, and three headers, windows.h, pdh.h and pdhmsg.h. The latter two come with the Microsoft SDK. The tools I used were the MinGW tool set in NetBeans.

You have to do five things:

  1. Create a query using PdhOpenQuery.
  2. Add a counter to it via PdhAddCounter
  3. Call PdhCollectQueryData twice on the query, sleeping in between
  4. Call PdhGetFormattedCounterValue on the counter
  5. Close the query

Now you should have a formatted data value you can display using printf or whatever. Here's a minimal program that does it (proper error-handling is left as an exercise for the reader):

#include <stdio.h>
#include <windows.h>
#include <pdh.h>
#include <pdhmsg.h>

int main( void )
{
  PDH_HQUERY query;
  PDH_STATUS status = PdhOpenQuery(NULL,(DWORD_PTR)NULL,&query);
  if (status==ERROR_SUCCESS )
  {
    HCOUNTER hCounter;
    /* backslashes in the counter path must be doubled in a C string */
    status = PdhAddCounter( query,
      "\\Processor(_Total)\\% Processor Time", 0, &hCounter );
    if (status==ERROR_SUCCESS)
    {
      /* the first sample only sets a baseline for the second */
      status = PdhCollectQueryData(query);
      if (status==ERROR_SUCCESS)
      {
        Sleep(1000);
        status = PdhCollectQueryData(query);
        DWORD ret;
        PDH_FMT_COUNTERVALUE value;
        status = PdhGetFormattedCounterValue(hCounter,
          PDH_FMT_DOUBLE|PDH_FMT_NOCAP100,&ret,&value);
        if (status==ERROR_SUCCESS)
        {
          printf("CPU Total usage: %2.2f\n",value.doubleValue);
        }
        else
          printf("error\n");
      }
      else
        printf("error\n");
    }
    else
      printf("error\n");
    PdhCloseQuery( query );
  }
  else
    printf("error\n");
  return 0;
}