CS244 ’15: Mosh | Reproducing Network Research Results


Team: Kevin McKenzie and Patrick Harvey

Introduction

Mosh: An Interactive Remote Shell for Mobile Clients was developed by Keith Winstein and Hari Balakrishnan of MIT. The original paper can be found here: http://mosh.mit.edu/mosh-paper.pdf

Mosh is an interactive shell that can be operated remotely from a variety of clients, including end-user PCs and mobile devices. As stated in the paper, Mosh “is a remote terminal application that supports intermittent connectivity, allows roaming, and speculatively and safely echoes user keystrokes for better interactive response over high-latency paths” [2]. It uses the new State Synchronization Protocol (SSP) to synchronize client and server state even as a user’s location and IP address change.

The traditional alternative to Mosh is SSH. SSH does not tolerate changes of IP address and does not perform well over high-latency paths. The authors demonstrated that SSH echoes keystrokes with a median latency of 503 ms over 3G connections. Research by Card et al. in 1991 showed that response times of 100 ms or less are vital for a fluid-feeling response [1]. Mosh introduces a predictive local echo that anticipates the echo for most keystrokes (like alphanumeric characters typed into a text editor) and displays them as soon as possible. This reduces the median response time to 5 ms.
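To make the idea of predictive local echo concrete, here is a deliberately simplified toy model, not Mosh's actual algorithm. It assumes that every printable character can be predicted and drawn locally, and that everything else waits one full round trip. The constant `LOCAL_ECHO_MS`, the set `PREDICTABLE`, and both function names are hypothetical and chosen only for illustration. Real Mosh also decides when a prediction is safe to show and checks it against the server's state, which this sketch omits.

```python
import string

# Hypothetical constants for illustration; not taken from the Mosh source.
LOCAL_ECHO_MS = 5.0  # assumed time to draw a predicted character locally
PREDICTABLE = set(string.ascii_letters + string.digits + string.punctuation + " ")


def response_time_ms(key, rtt_ms, predict):
    """Return how long the user waits to see the effect of one keystroke.

    With prediction on, printable characters are echoed locally right away;
    everything else (Enter, control keys) waits a full round trip.
    """
    if predict and key in PREDICTABLE:
        return LOCAL_ECHO_MS
    return rtt_ms


def session_response_times(keys, rtt_ms, predict):
    """Apply the toy model to every keystroke in a recorded session."""
    return [response_time_ms(k, rtt_ms, predict) for k in keys]


if __name__ == "__main__":
    keys = list("ls -la") + ["\r"]  # six printable keys plus Enter
    print("no prediction:", session_response_times(keys, 500.0, predict=False))
    print("prediction:   ", session_response_times(keys, 500.0, predict=True))
```

Even in this toy model, only the keystrokes that cannot be predicted pay the full network round trip, which is why the median can drop far below the link latency.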

Motivation

Mosh is a massive improvement over SSH for mobile clients and laptops. It is incredibly frustrating to lose your SSH connection each time you close your laptop, move to a different cell phone tower, or switch between 4G and Wifi. The predictive local echo ensures a smooth typing experience, even over flaky connections. These improvements have caused us to switch to using Mosh for most remote work. Our goal is to validate the findings from the original Mosh paper and so confirm the benefits of such a system.

Results

The Mosh authors demonstrated two major accomplishments in their paper. First, they showed that Mosh allows roaming while maintaining a consistent connection to the remote host: it keeps a connection alive while switching between technologies (Wifi to 3G, etc.), switching IP addresses, and resuming after the device wakes from sleep.

In addition, the authors showed that Mosh performs much better than SSH under high-latency conditions. The application uses predictive local echo to keep the response time for most keystrokes below the crucial 100 ms mark for a fluid connection, even when remote keystrokes take upwards of 500 ms to return to the device. The authors produced Figure 2 (below), which shows that Mosh is significantly better than SSH for high-latency connections.

Subset Goal

The authors produced the following graph during the keystroke latency test. They found that the median and mean response times using Mosh were 5 ms and 173 ms, respectively, while the median and mean response times using SSH were 503 ms and 515 ms, respectively. Both SSH numbers are about 5 times the 100 ms threshold for a fluid-feeling response. Mosh performed much better, with at least 70% of keystrokes having a response time of less than 10 ms.

[Figure 2 from the original paper: keystroke response times for Mosh and SSH]
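The gap between Mosh's median (5 ms) and mean (173 ms) is easier to see with a small worked example. The sketch below computes the summary statistics discussed above from a list of per-keystroke response times. The sample values are made up for illustration and are not measurements from the paper or from our experiment, and the function name `summarize` and its threshold parameters are our own.

```python
from statistics import mean, median


def summarize(latencies_ms, fluid_ms=100.0, instant_ms=10.0):
    """Summarize per-keystroke response times given in milliseconds."""
    n = len(latencies_ms)
    return {
        "median_ms": median(latencies_ms),
        "mean_ms": mean(latencies_ms),
        # Share of keystrokes that were displayed almost immediately.
        "frac_under_instant": sum(1 for x in latencies_ms if x < instant_ms) / n,
        # Share of keystrokes within the 100 ms "fluid" threshold.
        "frac_within_fluid": sum(1 for x in latencies_ms if x <= fluid_ms) / n,
    }


if __name__ == "__main__":
    # Made-up samples: 7 predicted keystrokes and 3 that waited for the server.
    mosh_like = [5, 5, 5, 5, 5, 5, 5, 500, 500, 500]
    # Made-up samples: every keystroke waits for a network round trip.
    ssh_like = [500] * 10
    print("mosh-like:", summarize(mosh_like))
    print("ssh-like: ", summarize(ssh_like))
```

A few slow keystrokes pull the mean up sharply while leaving the median untouched, which matches the shape of the numbers the authors report.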

Subset Motivation

We aimed to reproduce this result, as it is one of the two major accomplishments of Mosh. The first is continued operation while roaming, which is trivially verified by running a Mosh connection and switching from Wifi to mobile broadband. Figure 2 (above) is more interesting to validate because it demonstrates the second major accomplishment: local keystroke echoing. This technology allows 70% of keystrokes to have a response time of less than 10 ms, even on a high-latency network. This is a massive improvement over SSH, so validating the result would confirm Mosh’s advantage over the older protocol on mobile networks.

Subset Results

We simulated a client-server connection in Mininet using two hosts and a link between them. We gave the link the properties of a 3G connection, which we gathered from real-world test data on T-Mobile’s network. The client would initiate a connection using either SSH or Mosh, then use the authors’ original terminal keystroke replay scripts to send keystrokes to the server and record the responses. The graph below shows the recorded response times for Mosh and SSH over a 3G connection.

[Figure: 3G, response times for Mosh and SSH over the emulated 3G connection]

The graph closely matches the findings of the original authors. The main difference is that the 3G connection latency is smaller than the latency found in the original paper. This is expected, as the original authors used a Sprint 3G connection in Massachusetts in 2011-2012, while our connection statistics are from May 2015 in Stanford, CA on T-Mobile. Aside from the measured latency, our results closely match the findings of the original authors. The median response time for Mosh is also less than one fifth of the response time that the original authors found, but this is expected since the predictive local echo is CPU bound and we used powerful Amazon EC2 servers for our simulation.

Challenges

One of the primary challenges we faced was getting the terminal replay system used in the paper to operate correctly when running over Mosh in Mininet. The terminal replay system was essentially undocumented, and minor errors in command formatting could cause mystifying behavior, such as the replay server running on the same node as the client and communicating over the loopback interface instead of over SSH or Mosh, while still appearing to produce output. Additionally, the terminal replay system would terminate replays early if the recorded session opened TUIs such as non-windowed emacs or vim, and the output was placed oddly enough that it took some time to determine that this problem was not related to Mosh itself. We mostly sidestepped this issue by creating a new test trace that did not open any TUIs during the recorded terminal session. Finally, the original authors’ terminal replay script for the client failed to recognize the end-of-file of the terminal replay log, so we modified the script slightly to exit gracefully once the end-of-file is reached.

The standard version of Mosh available via apt-get or other package managers also suffered from a bug in which assertions about terminal window dimensions failed; this was fixed only in the most current source code. Even after downloading and compiling a version with the fix in place, a similar issue appeared when Mosh was run within certain terminal setups (such as a display-less VM) that did not provide normal terminal dimensions on the client end. This client-side issue appeared rather inconsistently. We temporarily tried minor edits to the Mosh source that hardcoded window dimensions for our trace, but this seems not to have been necessary except when we ran the test within a strictly no-GUI VM.

Critique

The results that the authors originally found hold up very well. The median and mean keystroke response times continue to be significantly lower with Mosh than with SSH. As seen in the next section, the results hold for good and flaky 3G and 4G LTE connections as well. It is clear that the predictive local echo and other features in Mosh significantly reduce latency for roughly 70-80% of keystrokes.

Extensions

To further test the Mosh platform, we gathered real-world data for 3G, 4G, and Wifi connections. We collected the data for each connection type using the FCC’s Speed Test App, which was designed to validate the reported speed claims of mobile broadband providers. Here are the results for both good and flaky mobile broadband connections:

[Figures: 3G, 3G_FLAKY, 4G_LTE_FLAKY, 4G_LTE, WIFI]

It is evident that Mosh continues to be a great application for newer mobile broadband technologies.  The Wifi test shows that Mosh performs at a similar level to SSH for keystroke latency under good conditions.

Platform

We chose to use Mininet to replicate the Mosh results. We were able to run the authors’ original keystroke replay scripts on two hosts within Mininet. One host served as a client that used SSH or Mosh to connect to the server host. This worked very well, but we did run into an issue. As Mosh was run on an emulated TTY host through Mininet, it had trouble returning keystrokes when using applications like emacs. As such, we were forced to record simpler replay scripts in which we just navigated through the file system and performed other simple tasks. Aside from this, Mininet was the perfect platform to validate results.

Mininet allows us to accurately emulate both good and flaky connections.  We utilized links with high latency, jitter, and packet loss to emulate flaky connections.  The exact connection statistics we used can be found in the readme in our repository, located here: https://github.com/kjtmckenzie/mosh_test/blob/master/README
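As a rough illustration of this kind of setup (not the code from our repository), the sketch below turns a link profile into keyword arguments for Mininet's `TCLink` and builds two hosts joined by one shaped link. The profile names and every number in `PROFILES` are placeholders invented for this example; the values we actually measured are the ones in the README. The helper names `tclink_opts` and `build_net` are likewise hypothetical. Building the network requires Mininet and root privileges, so the Mininet imports are kept inside `build_net`.

```python
# Placeholder link profiles for illustration only. These are NOT the measured
# values from the project README.
PROFILES = {
    "GOOD_EXAMPLE": {"bw": 2.0, "delay_ms": 60, "jitter_ms": 5, "loss_pct": 0},
    "FLAKY_EXAMPLE": {"bw": 1.0, "delay_ms": 150, "jitter_ms": 40, "loss_pct": 2},
}


def tclink_opts(profile):
    """Translate a profile into keyword arguments accepted by Mininet's TCLink."""
    return {
        "bw": profile["bw"],                    # bandwidth in Mbit/s
        "delay": f"{profile['delay_ms']}ms",    # one-way delay
        "jitter": f"{profile['jitter_ms']}ms",  # delay variation
        "loss": profile["loss_pct"],            # packet loss in percent
    }


def build_net(profile):
    """Create two hosts joined by one shaped link (needs Mininet and root)."""
    from mininet.link import TCLink
    from mininet.net import Mininet

    net = Mininet(link=TCLink, controller=None)
    client = net.addHost("client")
    server = net.addHost("server")
    net.addLink(client, server, **tclink_opts(profile))
    return net


if __name__ == "__main__":
    for name, profile in PROFILES.items():
        print(name, tclink_opts(profile))
```

Keeping the profiles as plain data makes it easy to run the same replay over several connection types by looping over the dictionary.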

Running the Experiment

Our code repository can be found here: https://github.com/kjtmckenzie/mosh_test

Setting up the test server on EC2:

  • Create an EC2 instance following the instructions listed here: http://web.stanford.edu/class/cs244/ec2setup.html
  • Select the AMI CS244-Spr15-Mininet or ami-cba48cfb on us-west-2
  • Select the c3.large instance type
  • Configure the security group to allow SSH access
  • Log into your EC2 instance and verify Mininet by running:
    • sudo mn --link tc,bw=10 --test iperf
  • For more detailed instructions, see the link above.

Once logged into the server, execute the following commands:

Running the setup.sh script will install all the relevant packages needed for the Mosh experiment. It may ask you for a “y” or “yes” to confirm the package installation.

Run the experiment:

  • sudo ./run.sh

This will take roughly 2 hours, and it will generate 5 graphs in the mosh_test directory: 3G.png, 3G_FLAKY.png, 4G_LTE.png, 4G_LTE_FLAKY.png and WIFI.png.
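Since the run takes a long time, it can be handy to confirm afterwards that all five graphs were produced. The following is a small optional helper, not part of our repository; the function name `missing_graphs` is our own, and it only checks that the files named above exist.

```python
from pathlib import Path

# The five output files named in the instructions above.
EXPECTED = ["3G.png", "3G_FLAKY.png", "4G_LTE.png", "4G_LTE_FLAKY.png", "WIFI.png"]


def missing_graphs(directory="."):
    """Return the expected graph files that are not present in `directory`."""
    base = Path(directory)
    return [name for name in EXPECTED if not (base / name).is_file()]


if __name__ == "__main__":
    missing = missing_graphs("mosh_test")
    if missing:
        print("Missing graphs:", ", ".join(missing))
    else:
        print("All 5 graphs were generated.")
```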

Citations

[1] Card, S.K., Robertson, G.G., and Mackinlay, J.D. The information visualizer: An information workspace. In Proceedings of the ACM SIGCHI Conference on Human Factors in Computing Systems (New Orleans, Apr. 28-May 2). ACM Press, New York, 1991, 181-188.

[2] Winstein, K. and Balakrishnan, H. Mosh: An Interactive Remote Shell for Mobile Clients. Presented as part of the 2012 USENIX Annual Technical Conference (USENIX ATC 12) (Boston, MA). 177-182.

One response to “CS244 ’15: Mosh | Reproducing Network Research Results”

  1. Hi,

    Your instructions were precise and straightforward and the code ran to completion. There was one small glitch in your post: there are single hyphens for the Mininet test commands where there should be two. Also, the code threw Mininet exceptions about inability to stop hosts, similar to PA2. This didn’t seem to affect results and the code did complete correctly despite this.

    The results we obtained matched very well with yours except the median for MOSH for which we got extremely small values (less than 50% of what you got). Additionally, the Wi-Fi graph had overlapping text and numbers making it impossible to read (except that the numbers seem to be in same ballpark as what you got).

    In order to rule out EC2 glitches we tried twice on two new c3.large VMs, and got the same result each time (with small median and overlapping text for Wi-Fi). We weren’t able to explain the discrepancy based on the contents of the blog post.

    The numbers we obtained are below. Additionally we have uploaded the PNGs to Google Drive in a publicly viewable folder (link is below).

    Run 1:

    3G:
    Mean: SSH: 123.3 Mosh: 25.8
    Median: SSH: 123.4 Mosh: 0.3

    3G Flaky:
    Mean: SSH: 151.4 Mosh: 30.6
    Median: SSH: 150.6 Mosh: 0.2

    4G:
    Mean: SSH: 89.8 Mosh: 19.1
    Median: SSH: 89.6 Mosh: 0.3

    4G Flaky:
    Mean: SSH: 137.1 Mosh: 27.0
    Median: SSH: 137.7 Mosh: 0.2

    Wi-Fi:
    Mean: SSH: ?? Mosh: ??
    Median: SSH: ?? Mosh: ??

    Run 2:

    3G:
    Mean: SSH: 121.1 Mosh: 25.4
    Median: SSH: 121.3 Mosh: 0.3

    3G Flaky:
    Mean: SSH: 153.8 Mosh: 29.6
    Median: SSH: 151.3 Mosh: 0.3

    4G:
    Mean: SSH: 89.5 Mosh: 19.3
    Median: SSH: 89.5 Mosh: 0.3

    4G Flaky:
    Mean: SSH: 135.1 Mosh: 27.9
    Median: SSH: 135.8 Mosh: 0.3

    Wi-Fi:
    Mean: SSH: ?? Mosh: ??
    Median: SSH: ?? Mosh: ??

    The PNGs of the results we obtained are at https://drive.google.com/folderview?id=0BzMwZGqrmYUnfnFYNkxXZ1FHSGxrNncwbmhjeXZHNENtZkppVHJGZS13TGMySno3VExtM0U&usp=sharing

    The sensitivity analysis was extremely good: the post demonstrates beyond doubt that Mosh improves response times on several types of links. Overall, we’d like to give this a 4.5 (mainly due to the unexpectedly small median and the Wi-Fi text issue). Great job guys!

    Thanks,
    Sudarshan S and Romil Verma
