Add support for running process asynchronously#835
Conversation
Codecov Report❌ Patch coverage is Additional details and impacted files@@ Coverage Diff @@
## master #835 +/- ##
========================================
+ Coverage 45.9% 46.6% +0.6%
========================================
Files 180 181 +1
Lines 14496 14760 +264
========================================
+ Hits 6666 6880 +214
- Misses 7830 7880 +50
Flags with carried forward coverage won't be shown. Click here to find out more. ☔ View full report in Codecov by Harness. |
|
Overall this design looks like what we had in mind in #552. Thank you very much for the implementation, looking forward to testing this on my lab at home. |
a336b91 to
1013b98
Compare
|
It is unfortunately a little hard to write automated tests for most of this since it requires actual hardware interaction, but I'll take a stab at it. I was able to test on actual hardware (Raspberry Pi 4 w/ u-boot) with the following test suite: import socket
import os
import time
def test_uboot_async(uboot_command):
with uboot_command.popen("echo hello") as p:
assert p.read(4).decode("utf-8").splitlines() == ["hell"]
assert p.read().decode("utf-8").splitlines() == ["o"]
with uboot_command.popen("echo bzap hello bzap") as p:
index, before, match, after = p.expect("hello")
assert index == 0
assert match.group(0).decode("utf-8") == "hello"
assert before.decode("utf-8") == "bzap "
assert after.decode("utf-8") == "hello"
assert p.read().decode("utf-8").splitlines() == [" bzap"]
def test_ssh_async(ssh_command):
with ssh_command.popen("echo hello") as p:
assert p.read(4).decode("utf-8").splitlines() == ["hell"]
assert p.read().decode("utf-8").splitlines() == ["o"]
with ssh_command.popen("echo bzap hello bzap") as p:
index, before, match, after = p.expect("hello")
assert index == 0
assert match.group(0).decode("utf-8") == "hello"
assert before.decode("utf-8") == "bzap "
assert after.decode("utf-8") == "hello"
assert p.read().decode("utf-8").splitlines() == [" bzap"]
def test_shell_nc(shell_command, ssh_command):
with shell_command.popen("nc -l -p 5555") as p:
# Wait for netcat to start listening
time.sleep(2)
with ssh_command.forward_local_port(5555) as local_port:
with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
s.connect(("localhost", local_port))
s.sendall(b"Hello World")
p.expect(b"Hello World")
time.sleep(2)
def test_ssh_nc(ssh_command):
with ssh_command.popen("nc -l -p 5555 127.0.0.1") as p:
# Wait for netcat to start listening
time.sleep(2)
with ssh_command.forward_local_port(5555) as local_port:
with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
s.connect(("localhost", local_port))
s.sendall(b"Hello World")
p.expect(b"Hello World")
time.sleep(2)
def test_shell_async(shell_command):
with shell_command.popen("echo hello") as p:
assert p.read(4).decode("utf-8").splitlines() == ["hell"]
assert p.read().decode("utf-8").splitlines() == ["o"]
with shell_command.popen("echo bzap hello bzap") as p:
index, before, match, after = p.expect("hello")
assert index == 0
assert match.group(0).decode("utf-8") == "hello"
assert before.decode("utf-8") == "bzap "
assert after.decode("utf-8") == "hello"
assert p.read().decode("utf-8").splitlines() == [" bzap"]
def test_shell_interrupt(shell_command):
with shell_command.popen("yes") as p:
l = p.read(1024).decode("utf-8").splitlines()
assert l == ["y"] * len(l)
shell_command.run_check("true")Note that I wasn't able to test the barebox driver, so manual testing there would be appreciated |
1013b98 to
6dd0e24
Compare
|
I added some tests, which revealed a few bugs that needed fixed. I think it's all good to go now though (although I still have no way to test barebox) |
6dd0e24 to
f837162
Compare
I added a little to |
580e931 to
88b4826
Compare
Bastian-Krause
left a comment
There was a problem hiding this comment.
I've tested this briefly now.
Issues I stumbled upon:
BareboxDrivernow times out onrun()- running multiple processes on
BareboxDriver/ShellDriveris not forbidden, but runs in timeouts - calling
stop()onConsoleMarkerProcessreturned byShellDriver.popen()leads to a timeout (e.g. stopping a "sleep 100") - the implementation looks incomplete, I don't see the prompt being detected onstop()
Open questions:
- why is
SSHDriver.run()not usingpopen()? - I do not really understand the difference between
read_nonblocking()/read()
Styling:
- we use f-strings where appropriate since #838, please do the same
Note: this is no comprehensive review yet, just a collection of things I saw while testing with a QEMU and a physical target.
|
|
||
| @step(args=['size', 'timeout'], result=True) | ||
| def read(self, size=-1, *, timeout=None): | ||
| t = OptionalTimeout(timeout, self._console.timeout) |
There was a problem hiding this comment.
QEMUDriver and ExternalConsoleDriver do not implement a timeout attribute.
| output = p.read(timeout=timeout) | ||
| # Remove VT100 Codes and split by newline | ||
| data = self.re_vt100.sub('', output.decode('utf-8')).split('\r\n')[1:-1] | ||
| self.logger.debug(f"Received Data: {data}") |
There was a problem hiding this comment.
log functions should not use f-strings, but rather lazy % formatting. There are multiple occurrences of this spread in this PR.
There was a problem hiding this comment.
Is there a reason why? It seem confusing to developers to mix the two styles without a good reason :/
There was a problem hiding this comment.
Yes, there is a reason. Here is a short wrap-up: https://medium.com/swlh/why-it-matters-how-you-log-in-python-1a1085851205
TL;DR: formatting is only done if required (i.e. the log level is reached) and errors are handled gracefully.
There was a problem hiding this comment.
Reading the article has confused me more; IIUC, the article is suggesting that logging should use lazy evaluation by passing format arguments as parameters (not using %), in which case the log message would be something like self.logger.debug("Received data: %s", data). However, it seems you are suggesting I use % instead, which is not a lazy evalutation and therefore not superior (at least in terms of what the article is discussing) from f-strings.
I think what you want to do is either use actual lazy evaluation for log messages, or non-lazy f-strings for consistency with everywhere else f-strings are used
There was a problem hiding this comment.
With "% formatting" I mean logger.debug("%s", data).
There was a problem hiding this comment.
OK, I will do that
| - AndroidFastbootDriver now supports booting/flashing images preconfigured in | ||
| the environment configuration. | ||
| - CommandProtocol now supports a popen() method to run a process on the target | ||
| asynchronously with the pytest code. |
There was a problem hiding this comment.
Why mention pytest here? The PR does not touch anything pytest-specific.
| data = self.re_vt100.sub( | ||
| '', output.decode('utf-8'), count=1000000 | ||
| ).replace("\r", "").split("\n") | ||
| self.logger.debug("Received Data: {data}") |
There was a problem hiding this comment.
I guess this is meant as an f-string. Please convert to lazy % formatting.
There was a problem hiding this comment.
Per previous discussion, % is not lazy
|
|
||
| def gen_marker(): | ||
| return ''.join(random.choice(string.ascii_uppercase) for i in range(10)) | ||
| return "".join(random.choice(string.ascii_uppercase) for i in range(10)) |
There was a problem hiding this comment.
No need to change this line.
| will never expire. If timeout negative and default is default is not None, | ||
| the default timer value will be used |
There was a problem hiding this comment.
This sentence needs revisiting.
I don't have any way of testing barebox, can you provide more information so I can fix it?
Do we want to explicitly disallow that? I'm fine with that if that's what everyone wants to do.
It would be nice, but I can't do that without breaking the SSH driver because of the stderr_merge option, which is not possible to implement with popen()
I tried to use f-strings everywhere.... IHMO it's confusing to developers to require different string formatting for different uses (unless there is some really compelling performance reason to do otherwise).
|
Here is my demo script: #!/usr/bin/env python3
import labgrid
labgrid.StepReporter()
env = labgrid.Environment("beaglebone.yaml")
target = env.get_target()
barebox = target.get_driver("BareboxDriver")
target.activate(barebox)
barebox.run("echo hi")This results in: I have not looked into it yet. |
a2719bd to
53b2f8c
Compare
|
I think I've addressed most of the comment, is there anything else needed to merge this? |
|
@JoshuaWatt Thanks, re-requesting review or a short notice like yours is always welcome, so we know when to look at PRs again. I need to look at/test the current state again. I'll come back to you. |
|
How would you run a command that takes longer than 30 seconds? This ends up with a Another thing: how would you read the output of a long-running command (e.g. |
Ah, this has nothing to do with the long-running command. The context manager hits CTRL-c on close, which prevents the second marker from being echoed (on the shell, not in barebox though). That's why we run into the timeout there. |
|
Does something need to change? It seems like it's doing what you are asking for (launching the process, then immediately terminating it). If you want to wait for it to exit normally, use |
Okay, will try that.
I think the behavior between barebox and shell should be consistent. I would not expect to get a |
Ya, that makes sense; let me see what I can do |
53b2f8c to
83c64e9
Compare
|
@Bastian-Krause Ok, I tracked this down to a behavior difference is bash (which I was using) vs. busybox. I've corrected it and now it should work the same for both shells |
|
For those interested, the difference is: bash: busybox: Note that interrupting the command on busybox prevents all subsequent commands (the |
58d847d to
ba12014
Compare
a9e0e17 to
8546a36
Compare
489db0e to
ac03245
Compare
|
@Bastian-Krause I think this is ready, would you mind looking again? |
|
Thanks for keeping at it! I'll try to have a look at this and run our network test suite against this. |
Due to our tasks shifting, I did not work on our (currently internal) test suite, thus did not manage to get this tested. Sorry. If anyone else could give this a spin, that would help. |
abe5a95 to
f6b278f
Compare
| """labgrid Wrapper of the pexpect module. | ||
|
|
||
| This class provides pexpect functionality for the ConsoleProtocol classes. | ||
| driver: ConsoleProtocol object to be passed in |
There was a problem hiding this comment.
This should explain why we need this in addition to the PtxExpect wrapper in labgrid/util/expect.py. It seems this allows the user to use expect only on the process output, without having to care about the markers.
| def send(self, s): | ||
| raise NotImplementedError("Sending not implemented") |
There was a problem hiding this comment.
Is there a reason why we don't want to implement send?
Allowing the class to take an integer for a timeout instead of a strict float seems more convenient to users than forcing it at a higher level, or breaking usage by changing other APIs to require floats (i.e. run_check()) Signed-off-by: Joshua Watt <Joshua.Watt@garmin.com>
Extends the command protocol to support "background" processes. Signed-off-by: Joshua Watt <Joshua.Watt@garmin.com>
Adds a read mix in class that implements read helper functions to deal with reading from sources that will return only buffered data, instead of trying harder to read up to the user requested size. These functions are modeled after the functions provided by Rust Signed-off-by: Joshua Watt <Joshua.Watt@garmin.com>
Many different drivers need to implement "marker" style command processes where the command output is terminated with a random marker then the command status code and a prompt. Instead of making each driver implement this logic over and over again, implement a common class that handles this type of command output. Signed-off-by: Joshua Watt <Joshua.Watt@garmin.com>
c686d51 to
8e510d5
Compare
Implements the start_process() API for "background" processes using the ConsoleMarkerProcess helper class. Rework the _run() API to use a background process behind the scenes Signed-off-by: Joshua Watt <Joshua.Watt@garmin.com>
Implements the start_process() API for "background" processes using the ConsoleMarkerProcess helper class. Rework the _run() API to use a background process behind the scenes Signed-off-by: Joshua Watt <Joshua.Watt@garmin.com> [rouven.czerwinski@linaro.org: adjusted to UBoot timestamp removal] Signed-off-by: Rouven Czerwinski <rouven.czerwinski@linaro.org>
Implements the start_process() API for "background" processes using the ConsoleMarkerProcess helper class. Rework the _run() API to use a background process behind the scenes Signed-off-by: Joshua Watt <Joshua.Watt@garmin.com>
Implements the start_process() API for SSH connections. The SSHDriverProcess is a wrapper around pexpect.spawn() Signed-off-by: Joshua Watt <Joshua.Watt@garmin.com>
Signed-off-by: Joshua Watt <Joshua.Watt@garmin.com>
Instructs ssh to open a pseudo terminal when running start_process() commands. This allows signals to propagate to the remote process (instead of being intercepted by the ssh client process), emulating the start_process() signal behavior on other drivers. Signed-off-by: Joshua Watt <Joshua.Watt@garmin.com>
Disables echo in the shell when using start_process() with the ssh driver. This prevents duplicate output so it behaves as the other drivers. Signed-off-by: Joshua Watt <Joshua.Watt@garmin.com>
8e510d5 to
84402d4
Compare
|
Would someone mind looking at this change again? Thanks |
Emantor
left a comment
There was a problem hiding this comment.
I gave this a once over and everything seems to be addressed. We just had a release, so its a good time to get this in for additional testing.
Description
Adds support for running processes asynchronously with the main test code. Attempts to incorporate the feed back from #552
Checklist