TermiShell: a web-based shell for the Raspberry Pi (development notes)

Introduction

In the course of development of PiCockpit, I am going to add a web-based Terminal called TermiShell.

image

TermiShell icon, by: Stephanie Harvey via unsplash.com

TermiShell is going to allow you to log into your Raspberry Pi using PiCockpit.com (and the picockpit-client) – no additional application required on either side. This should be very comfortable, especially when on the go.

TermiShell is not going to be released in the upcoming v2.0 release of PiCockpit, because it requires additional preparation work, and would delay the upcoming release significantly. I would also rather not cut corners on security (which would be the alternative to make it work right away).

The work, however, will positively impact many other capabilities of PiCockpit as well – for instance the ability to stream video (from the Pi camera), file uploads / downloads from the Pi, and many other functionalities which require data streams.

I am compiling my thoughts on how to realize such a web-based terminal, along with background information, for myself and for other interested developers.

What is a pseudo-terminal (pty)?

Python offers built-in functionalities to execute processes, and to capture their output (stdout and stderr) and send them input (stdin).

Here is a first attempt with subprocess:

# dow, day of work 1.5.2020
import os
import sys
import threading
import subprocess

print("Hello world!")
print("Running omxplayer")

def output_reader(proc):
    # read the process output one byte at a time until EOF (b'')
    while True:
        chars = proc.stdout.read(1)
        if not chars:
            break
        print(chars.decode('utf-8', errors='replace'), end="", flush=True)

proc = subprocess.Popen(
    ['omxplayer', '/home/pi/thais.mp4'],
    stdin=subprocess.PIPE,
    stdout=subprocess.PIPE,
    stderr=subprocess.PIPE,
    bufsize=0)

t = threading.Thread(target=output_reader, args=(proc,))
t.start()

# reopen our own stdin without buffering; this requires binary mode
sys.stdin = os.fdopen(sys.stdin.fileno(), 'rb', buffering=0)

while True:
    char = sys.stdin.read(1)
    if not char:
        # EOF on our own stdin (e.g. Ctrl+D): stop forwarding input
        break
    print(char.decode('utf-8', errors='replace'), end="", flush=True)
    proc.stdin.write(char)

proc.stdin.write(b'z')

t.join()

Please note the following in the code above (which is not perfect, just demo code – and in fact might not work as expected!):

  • bufsize is set to 0 – to suppress buffering
  • I set up a thread to read the output of the process character by character
  • I set up stdin to have zero buffering. This in turn requires it to be opened in binary mode (rb)
  • I read characters one at a time from the user’s input. I echo them using flush=True (otherwise output is line-buffered in Python by default for print)
  • I write the character to the process. Remember, it is not buffered because we set up bufsize=0

Even so, we run into the following situation: output from the application (in this case omxplayer) is not received character by character, as expected. Instead, it is dumped all at once, at exit, even though buffering is set to 0. Why?

Buffering & interactive processes

The C standard library (stdio) inside the application buffers its output, and it is clever about it, too. If the output of the process is connected not to an interactive device (a terminal) but to a pipe, output is buffered until the buffer has been filled.

Then it is efficiently copied over to the other application.

This is good, resource-efficient behavior for a lot of use cases. But not if you are trying to interactively control the application.
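How an application arrives at this decision can be made visible from Python. The following is a minimal sketch (the helper name is made up for illustration): it starts a child Python process with its stdout connected to a pipe, and lets the child report whether it believes it is talking to a terminal.

```python
import subprocess
import sys


def child_sees_terminal_on_pipe() -> bool:
    """Start a child whose stdout is a pipe and ask it whether stdout is a TTY."""
    result = subprocess.run(
        [sys.executable, "-c", "import sys; print(sys.stdout.isatty())"],
        stdout=subprocess.PIPE,
        check=True,
    )
    # The child prints "True" or "False"; convert that text back to a bool.
    return result.stdout.decode().strip() == "True"


if __name__ == "__main__":
    print("child thinks stdout is a terminal:", child_sees_terminal_on_pipe())
```

The child reports that it is not attached to a terminal. This is the same kind of check that lets stdio switch to the larger, pipe-friendly buffer.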

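There is also very little you, the developer, can do from your end of the pipe to influence how the other application buffers its output: bufsize=0 only affects your own side.

Short of helper tools such as stdbuf (which only work for programs that leave the C library's default buffering in place), you would need to recompile the other application (and manually adjust the buffering behavior).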


Humans, and applications that lock up if no output is provided immediately, obviously need interactive output. This is where the pseudoterminal comes in.

The pseudoterminal

The pseudoterminal simulates an interactive terminal to the application. stdio thinks it is talking to a human, and no longer holds output back in a large buffer (at most, it buffers one line at a time). Output is as you would expect it from interacting with Linux applications on the command line (e.g. via SSH).
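To illustrate, here is a minimal sketch using Python's built-in pty module, assuming a Linux system (the helper name is made up for illustration). It connects the stdout of a child process to the slave side of a freshly created pseudoterminal and reads the child's answer from the master side.

```python
import os
import pty
import subprocess
import sys


def child_sees_terminal_on_pty() -> bool:
    """Start a child whose stdout is a pseudoterminal and ask it whether stdout is a TTY."""
    master_fd, slave_fd = pty.openpty()
    try:
        proc = subprocess.Popen(
            [sys.executable, "-c", "import sys; print(sys.stdout.isatty())"],
            stdin=subprocess.DEVNULL,
            stdout=slave_fd,
        )
        proc.wait()
        # The slave side is still open in this process, so the short
        # answer written by the child is waiting on the master side.
        output = os.read(master_fd, 1024)
    finally:
        os.close(slave_fd)
        os.close(master_fd)
    return output.decode().strip() == "True"


if __name__ == "__main__":
    print("child thinks stdout is a terminal:", child_sees_terminal_on_pty())
```

This time the child reports a terminal, which is exactly the situation in which stdio stops collecting output in a large buffer.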

image

image

As you can see in the output of ps aux, some applications do not have a TTY (terminal) assigned to them (showing up with a question mark “?”) – in this case, expect the applications to show a different default buffering behavior.

Pseudoterminals look like this in ps aux:

image

I will decode the information for you:

  • sshd connects to pseudoterminal 0 (pts/0).
  • bash, and some other processes are started on pseudoterminal 0 (pts/0)
  • I use sudo su to run a command as root (which in turn runs su, and then bash): python3 ttt.py
  • this script (which I will show you in a little while) creates a new pseudoterminal pts/1
  • I run /bin/login from my script to check the user credentials. Because I entered them correctly, bash (the default shell) is started on pts/1
  • here I ping miau.de – this process is also executed in pts/1
  • I also start a second SSH connection, which attaches to pts/2 – in this case to run ps aux, to be able to create the screenshot above.
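The device names in the list above (pts/0, pts/1, ...) can also be queried from Python. A minimal sketch, assuming a Linux system (the function name is made up for illustration): it creates a new pseudoterminal and asks for the device name of its slave side, which is what ps aux shows in the TTY column for processes attached to it.

```python
import os
import pty


def new_pty_device_name() -> str:
    """Create a pseudoterminal and return the device path of its slave side."""
    master_fd, slave_fd = pty.openpty()
    try:
        return os.ttyname(slave_fd)
    finally:
        os.close(slave_fd)
        os.close(master_fd)


if __name__ == "__main__":
    # On Linux this typically prints something like /dev/pts/3
    print(new_pty_device_name())
```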

This is what the first SSH connection looks like:

image

(Note to the astute reader: I tried two times, hence first the ping to google)

image


Python backend & more on pseudoterminals

Python has built-in utilities for the pseudo-terminal: pty. I found the documentation hard to get into, in particular what child, master and slave refer to.

I found an example here, in the Invoke source code. The “good stuff” starts in line 1242 (def start; no pun intended). As you can see, there is a reference to pexpect.spawn doing additional stuff for setup and tear down.

Therefore, I simply decided to use pexpect as a wrapper library.

pexpect

The documentation for pexpect can be found here. Its main use case is to automate interactive applications.

Think of, for instance, a command line FTP client. It can also be used to automate tests of applications.

pexpect creates a pseudoterminal for the applications it launches.

As discussed above, the pseudoterminal has the big (and, for our purpose, required) advantage of putting these applications into the buffering mode which we are used to from interactive applications.

Pseudoterminals behave like real terminals: they have a screen size (number of columns and rows), and applications write control sequences to them to affect the display.

Screen size and SIGWINCH

Linus Akesson has a good explanation on TTYs in general, and also about SIGWINCH. SIGWINCH is a signal, similar to SIGHUP or SIGINT.

In the case of SIGWINCH, it is sent to the child process to inform it whenever the terminal size changes.

This can happen, for instance, if you resize your PuTTY window.

Editors like nano and others (e.g. alsamixer, …) which use the full screen and interact with it, need to know the screen size to do their calculations correctly and render the output correctly.

Thus, they listen for this signal, and if it arrives, reset their calculations.
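The screen size is stored with the terminal device itself, and can be read and changed through ioctl calls. Below is a minimal sketch using only the Python standard library, assuming Linux (the helper names are made up for illustration). Changing the size on the master side is what causes the kernel to send SIGWINCH to the foreground processes on the slave side. pexpect wraps the same mechanism in child.setwinsize(rows, cols).

```python
import fcntl
import pty
import struct
import termios


def set_winsize(fd: int, rows: int, cols: int) -> None:
    """Set the terminal size; the two trailing zeros are the unused pixel sizes."""
    fcntl.ioctl(fd, termios.TIOCSWINSZ, struct.pack("HHHH", rows, cols, 0, 0))


def get_winsize(fd: int) -> tuple:
    """Return the terminal size as (rows, cols)."""
    packed = fcntl.ioctl(fd, termios.TIOCGWINSZ, b"\0" * 8)
    rows, cols, _xpixel, _ypixel = struct.unpack("HHHH", packed)
    return rows, cols


if __name__ == "__main__":
    master_fd, slave_fd = pty.openpty()
    set_winsize(master_fd, 40, 120)
    # An application attached to the slave side would see the new size here.
    print(get_winsize(slave_fd))
```

For a web-based terminal, this means the frontend has to report its size (and any later resize) to the backend, which then applies it to the pseudoterminal.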

image

As you can see from this fullscreen example, pexpect sets a smaller screen size than my actual available space in PuTTY – therefore the output size is limited.

This brings us to:

Control sequences (ANSI escape codes)

How is it possible to color output on the command line?

image

How is it possible for applications to show interactive progress bars on the command line? (e.g. wget):

image

This is done using control sequences, embedded in the output of the application. (This is an example of in-band information being communicated additionally, much like the phone companies used to signal billing information, etc. in-band as well, allowing phreakers to abuse the system!)

This, for example, will produce a new line: \r\n

\r is for carriage return, it moves the cursor to the beginning of the line without advancing to the next line.

\n line feed, moves the cursor to the next line.

Yes, even in Linux. In files, \n alone usually means new line and implies a carriage return, but for application output to be interpreted correctly by the terminal, \r\n is what has to arrive there to go to the next line.

Applications typically still write just \n; the \r\n is produced by the TTY device driver, which by default translates \n to \r\n on output.

Read more about this behavior in the pexpect documentation.
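The effect can be observed directly with Python's built-in pty module. This is a minimal sketch, assuming Linux and the default terminal settings of a newly created pseudoterminal (the function name is made up for illustration): bytes written to the slave side play the role of application output, and bytes read from the master side are what the terminal emulator receives.

```python
import os
import pty


def as_seen_by_terminal(data: bytes) -> bytes:
    """Write data as an application would, return what arrives at the terminal side."""
    master_fd, slave_fd = pty.openpty()
    try:
        os.write(slave_fd, data)
        return os.read(master_fd, 1024)
    finally:
        os.close(slave_fd)
        os.close(master_fd)


if __name__ == "__main__":
    # The application wrote only \n, the terminal side receives \r\n.
    print(as_seen_by_terminal(b"hello\n"))
```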

To create colored text, or move the cursor to any point on the screen, escape sequences can be used.

These escape sequences start with the ESC byte (27 / hex 0x1B / oct 033), and are followed by a second byte in the range 0x40 – 0x5F (ASCII  @A-Z[\]^_ )

One of the most useful is the control sequence introducer, or CSI (ESC is followed by [ in this case).

CSI sequences can be used to position the cursor, erase part of the screen, set colors, and much more.
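As a small illustration, CSI sequences can be assembled by hand in Python. The helper names below are made up for this sketch; the sequences themselves (m for colors and text attributes, H for cursor positioning) are standard ANSI escape codes.

```python
ESC = "\x1b"     # the ESC byte (27 / hex 0x1B / oct 033)
CSI = ESC + "["  # control sequence introducer


def sgr(*codes: int) -> str:
    """Build a 'select graphic rendition' sequence, e.g. sgr(1, 32) for bold green."""
    return CSI + ";".join(str(code) for code in codes) + "m"


def colored(text: str, *codes: int) -> str:
    """Wrap text in the given attributes and reset them afterwards."""
    return sgr(*codes) + text + sgr(0)


def cursor_to(row: int, col: int) -> str:
    """Build the sequence which moves the cursor to the given 1-based position."""
    return CSI + f"{row};{col}H"


if __name__ == "__main__":
    print(colored("bold green text", 1, 32))
    print(repr(cursor_to(5, 10)))
```

A terminal (or a terminal emulator such as PuTTY) interprets these bytes instead of displaying them.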

Thus, you can set the color of your prompt in the ~/.bashrc like this:

PS1='${debian_chroot:+($debian_chroot)}\t \[\033[01;32m\]\u@\h\[\033[00m\]:\[\033[01;37m\]\w \$\[\033[00m\] '

This used to be “magic incantations” to me before I understood this topic. You can now recognize the control sequences, which PuTTY interprets directly:

  • \[\033[01;32m\]

Here, \[ and \] are bash prompt markers: they tell bash that the enclosed characters are non-printing, so that it can calculate the visible length of the prompt correctly.

The control sequence itself is inside: \033 is ESC (in octal representation), followed by [, and then 01;32m, where 01 selects bold, 32 selects a green foreground color, and m ends the sequence.


Actual pexpect code

Here are some useful snippets of pexpect code (note, pexpect needs to be installed on your system first in the usual way, see pexpect documentation).

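import pexpect
import time

# pexpect.spawn expects the command as a string
# (additional arguments can be passed via args=[...])
child = pexpect.spawn('login', maxread=1)

# give login a moment to print its prompt, then show what has arrived so far
time.sleep(0.5)
print(child.read_nonblocking(size=30, timeout=0))

# do not wait before sending, as we are forwarding real user input
child.delaybeforesend = None

# this sends a right arrow key
child.send("\033[C")

child.interact()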

maxread=1 sets buffering to none.

Important: We set delaybeforesend to None, as we will be funneling real user input through pexpect, which has built-in delays by nature; and we do not want to increase latency unnecessarily!

For actual non-interactive use (main use case of pexpect), the default is recommended.

interact() will show the output of the application directly, and send your user input to the application.

Please note: for the live example above, child.interact() was run directly after the pexpect.spawn() statement.

Web-frontend

The plot thickens. What we need on the other side is a JavaScript application capable of understanding and rendering these control sequences. (I have also seen such applications referred to as VT-100 compatible.)

My research has led me to xterm.js.

It promises performance, compatibility, unicode support, being self-contained, etc.

It is shipped through npm, which is good for my new workflow for picockpit-frontend.

Many applications are based on xterm.js, including Microsoft Visual Studio Code.

Transport

Having components on both sides ready, the only thing which is missing is the transport. Here we have to talk about latency, and what users expect.

Since this is an interactive application, we will need to send characters which the user types one by one to the backend – which should (depending on the active application) immediately echo these characters back.

No buffering is possible here. The only exception is that the application may send more data as a reply, or on its own, which we can send as one packet of data.

But the user input has to be streamed character by character.

My initial thought on this was to send individual, stripped down MQTT messages.

But this conflicts with the user’s privacy.

Even though the data is run through WebSockets (and thus https), the messages pass unencrypted through the picockpit.com MQTT broker (VerneMQ).

Shell interactions will include passwords, and sensitive data – thus a secure channel MUST be established.

Currently I am thinking of using WebSockets, possibly together with a library on top of them, to multiplex several different data streams through a single connection.
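To make the idea of multiplexing more concrete, here is a minimal sketch of how several streams could share one connection. The frame layout (a 2-byte stream id, a 4-byte payload length, then the payload) and the function names are assumptions made up for this illustration; this is not PiCockpit's actual protocol, and it does not cover encryption.

```python
import struct
from typing import Iterator, Tuple

# Hypothetical frame header: stream id (uint16) and payload length (uint32),
# both in network byte order.
HEADER = struct.Struct("!HI")


def encode_frame(stream_id: int, payload: bytes) -> bytes:
    """Prefix the payload with its stream id and length."""
    return HEADER.pack(stream_id, len(payload)) + payload


def decode_frames(buffer: bytes) -> Iterator[Tuple[int, bytes]]:
    """Yield (stream_id, payload) for every complete frame in the buffer."""
    offset = 0
    while offset + HEADER.size <= len(buffer):
        stream_id, length = HEADER.unpack_from(buffer, offset)
        start = offset + HEADER.size
        end = start + length
        if end > len(buffer):
            break  # incomplete frame, wait for more data
        yield stream_id, buffer[start:end]
        offset = end


if __name__ == "__main__":
    # Stream 1 carries a single typed character, stream 2 a chunk of file data.
    wire = encode_frame(1, b"l") + encode_frame(2, b"\x00\x01\x02")
    for stream_id, payload in decode_frames(wire):
        print(stream_id, payload)
```

With a scheme like this, a single keystroke costs only a few bytes of overhead, while larger replies from the application can travel as one frame.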

The transport bit is the reason that I am delaying TermiShell until the 3.0 release of PiCockpit: additional tooling has to go into the transport, allowing it to be reused for other applications and use cases as well.
