= (sustained SSH during MPICH computing)

I've been losing my SSH connection after starting MPICH jobs on my new Beowulf cluster.  My theory is that the network switch gets so clogged with MPI-related communication (which does take place via SSH) that there's no bandwidth left for my administrative SSH connection.  The theory is supported by the observation that when I plug my unrelated control machine into the switch, it can't ping Google.  This page is my current fix: a wireless access point on the cluster that gives me a second way in.

assumptions

  • Ubuntu 16.04.1 LTS
  • working wireless and ethernet: I had to do this and this.

sources

https://web.archive.org/web/20140307210607/http://www.themagpi.com/issue/issue-11/article/turn-your-raspberry-pi-into-a-wireless-access-point
http://askubuntu.com/a/180734
https://seravo.fi/2014/create-wireless-access-point-hostapd

check wifi hardware capability

Run the command 
iw list
and look for a section like the following.  If it includes a '* AP' line, you're golden.  If not, look for a different wireless card.
Supported interface modes:  
* IBSS
* managed
* AP
* AP/VLAN
* monitor
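
If you'd rather not eyeball the output, the check can be scripted.  This is a minimal sketch, not something from the sources above: supports_ap_mode is a name I made up, and it assumes the modes are printed one per line as '* AP', like in the listing above.

```shell
# Succeeds if the text on stdin lists AP among the supported interface modes.
# It reads stdin so it can be fed live output from iw.
supports_ap_mode() {
    # Match a line that is exactly "* AP" (ignoring surrounding whitespace),
    # so a lone "* AP/VLAN" line does not count.
    grep -Eq '^[[:space:]]*\*[[:space:]]+AP[[:space:]]*$'
}

# Usage (commented out so sourcing this file has no side effects):
# iw list | supports_ap_mode && echo "AP mode supported" || echo "no AP mode"
```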

install dependencies

sudo apt-get install rfkill hostapd hostap-utils iw dnsmasq   

identify interface names

As of Ubuntu 16.04, the standard wlan0 and eth0 interface names are no longer in use.  You'll have to identify yours specifically.  Use the following command, which lists the contents of the folder for each interface device, and look for the device that has a folder named 'wireless'.
ls /sys/class/net/*
In the rest of this page I call the ethernet interface enp2s0 and the wireless interface wlp1s0; substitute your own names.
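
The same lookup can be done in a loop that prints only the wireless devices.  A rough sketch: list_wireless_ifaces is a hypothetical helper name, and the optional directory argument exists only so the function can be pointed at a test tree instead of /sys/class/net.

```shell
# Print the name of every interface whose sysfs folder contains a
# 'wireless' subfolder.  Defaults to /sys/class/net.
list_wireless_ifaces() {
    net_dir="${1:-/sys/class/net}"
    for dev in "$net_dir"/*; do
        # -d follows symlinks, which matters because sysfs entries are symlinks
        if [ -d "$dev/wireless" ]; then
            basename "$dev"
        fi
    done
}

# Usage (commented out so sourcing this file has no side effects):
# list_wireless_ifaces
```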

configure wifi settings

There are three files you'll have to configure.  Since I'm logged in via SSH, I don't want to interrupt my connection until I've created a new access point I can connect to.  So I'll walk through editing each file in turn, then run one command at the end that activates all the changes.

configure wireless interface: /etc/network/interfaces

Back up your current interfaces file.
sudo cp /etc/network/interfaces /etc/network/interfaces.bak
Then edit the original.
sudo nano /etc/network/interfaces
Replace the contents of the file with the following, changing the interface names as appropriate.
auto lo
iface lo inet loopback

auto enp2s0
iface enp2s0 inet dhcp

auto wlp1s0
iface wlp1s0 inet static
hostapd /etc/hostapd/hostapd.conf
address 192.168.3.14
netmask 255.255.255.0
Normally I'd say that here's where you restart the interface, but we're saving that for the end.

configure the access point: /etc/hostapd/hostapd.conf

Back up the original file.  It's OK if there's nothing there; cp will just complain that the file doesn't exist.
sudo cp /etc/hostapd/hostapd.conf /etc/hostapd/hostapd.conf.bak
Edit the original.
sudo nano /etc/hostapd/hostapd.conf
Put this in:
interface=wlp1s0
driver=nl80211
ssid=test
hw_mode=g
channel=1
macaddr_acl=0
auth_algs=1
ignore_broadcast_ssid=0
wpa=3
wpa_passphrase=1234567890
wpa_key_mgmt=WPA-PSK
wpa_pairwise=TKIP
rsn_pairwise=CCMP
Inexplicably, this only seems to produce a detectable wifi access point when the ssid is 'test'.  I tried several other non-keyword names, and none of them worked.  I went back to 'test', and it worked.  Did it several times...magic.

Save and exit.
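
I never got to the bottom of the ssid mystery.  One thing that can at least be ruled out is a length problem: hostapd wants an ssid of at most 32 characters and a wpa_passphrase of 8 to 63 characters.  The sketch below checks just those two limits; check_hostapd_conf is a name I made up, and it assumes the plain key=value layout used above.

```shell
# Check the ssid and wpa_passphrase lengths in a hostapd config file.
# Prints one message per problem and returns non-zero if any is found.
check_hostapd_conf() {
    conf="${1:-/etc/hostapd/hostapd.conf}"
    status=0
    # Take everything after the '=' on the first matching line
    ssid=$(sed -n 's/^ssid=//p' "$conf" | head -n 1)
    pass=$(sed -n 's/^wpa_passphrase=//p' "$conf" | head -n 1)
    if [ "${#ssid}" -lt 1 ] || [ "${#ssid}" -gt 32 ]; then
        echo "ssid must be 1 to 32 characters"
        status=1
    fi
    if [ "${#pass}" -lt 8 ] || [ "${#pass}" -gt 63 ]; then
        echo "wpa_passphrase must be 8 to 63 characters"
        status=1
    fi
    return "$status"
}

# Usage (commented out so sourcing this file has no side effects):
# check_hostapd_conf /etc/hostapd/hostapd.conf && echo "lengths look fine"
```

If the lengths are fine and the access point still doesn't show up, running hostapd in the foreground with debug output (sudo hostapd -dd /etc/hostapd/hostapd.conf) may say more than the service does.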

configure the DHCP server

This is where the access point actually becomes something you can access.  Back up the original:
sudo cp /etc/dnsmasq.conf /etc/dnsmasq.conf.bak
Edit the original.  Since the file is so big, I rm'd the original and pasted the contents below into an empty file.
sudo rm /etc/dnsmasq.conf
sudo nano /etc/dnsmasq.conf
Make it look like this:
# Never forward plain names (without a dot or domain part)
domain-needed

# Only listen for DHCP on wlp1s0
interface=wlp1s0

# Create a domain if you want, comment it out otherwise
# domain=Pi-Point.co.uk

# Create a dhcp range on your /24 wlp1s0 network with 12 hour lease time
dhcp-range=192.168.3.15,192.168.3.254,255.255.255.0,12h
Save and exit.
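
The DHCP range only makes sense if it sits in the same /24 as the static address given to wlp1s0 in /etc/network/interfaces (192.168.3.14 above).  Here's a small sanity check for that, as a sketch: same_slash24 is a hypothetical helper, and it only handles the 255.255.255.0 netmask used on this page.

```shell
# Succeed if two dotted-quad IPv4 addresses share their first three octets,
# i.e. sit in the same /24 network (netmask 255.255.255.0).
same_slash24() {
    # ${var%.*} strips the last dot and everything after it (the last octet)
    [ "${1%.*}" = "${2%.*}" ]
}

# Usage (commented out so sourcing this file has no side effects):
# same_slash24 192.168.3.14 192.168.3.15 && echo "range start is on the interface's subnet"
# dnsmasq --test    # asks dnsmasq to syntax-check its config without starting
```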

implement changes

This is going to be one big command.  If it works, you're in business...if it doesn't, you'll have to log in directly to the machine for troubleshooting.
sudo ifdown wlp1s0; sudo ifup wlp1s0; sudo service hostapd restart; sudo service dnsmasq restart
Worked for me: I now have secondary wireless access to my Beowulf cluster for when the ethernet gets clogged with MPI traffic.
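
If the one-liner fails, it doesn't tell you which step broke.  Below is a sketch of the same sequence that stops at the first failing step and names it.  apply_ap_changes and the RUN variable are my own inventions for illustration: leave RUN unset to run the commands for real, or set RUN=echo to print them without touching anything.

```shell
# Run the same four steps as the one-liner, stopping at the first failure.
apply_ap_changes() {
    wifi="${1:-wlp1s0}"
    # ifdown may complain if the interface was not up yet, so its result is ignored
    ${RUN:-} sudo ifdown "$wifi"
    for step in "ifup $wifi" "service hostapd restart" "service dnsmasq restart"; do
        # $step is intentionally unquoted so it splits into command + arguments
        if ! ${RUN:-} sudo $step; then
            echo "failed: $step" >&2
            return 1
        fi
    done
}

# Usage (commented out so sourcing this file has no side effects):
# RUN=echo apply_ap_changes wlp1s0    # dry run, prints the commands
# apply_ap_changes wlp1s0             # the real thing
```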