RPC error

I have found a reproducible error when communicating with the Saleae.

I have a short 0.2 s capture running in a loop (the same thing happens with longer captures of 5 seconds). The following error occurs after around 1900 iterations (it varies, from as low as 1850 to as high as 1930).

INFO:saleae.automation.manager:sub ChannelConnectivity.CONNECTING
INFO:saleae.automation.manager:sub ChannelConnectivity.TRANSIENT_FAILURE
Traceback (most recent call last):
  File "test_saleae1.py", line 73, in <module>
    test_the_saleae()
  File "test_saleae1.py", line 44, in test_the_saleae
    capture.save_capture(filepath=capture_filepath)
  File "C:\Python37\lib\site-packages\saleae\automation\capture.py", line 391, in __exit__
    self.close()
  File "C:\Python37\lib\site-packages\saleae\automation\capture.py", line 347, in close
    self.manager.stub.CloseCapture(request)
  File "C:\Python37\lib\site-packages\grpc\_channel.py", line 946, in __call__
    return _end_unary_response_blocking(state, call, False, None)
  File "C:\Python37\lib\site-packages\grpc\_channel.py", line 849, in _end_unary_response_blocking
    raise _InactiveRpcError(state)
grpc._channel._InactiveRpcError: <_InactiveRpcError of RPC that terminated with:
    status = StatusCode.UNAVAILABLE
    details = "failed to connect to all addresses; last error: UNAVAILABLE: ipv4:127.0.0.1:10430: WSA Error"
    debug_error_string = "UNKNOWN:failed to connect to all addresses; last error: UNAVAILABLE: ipv4:127.0.0.1:10430: WSA Error {grpc_status:14, created_time:"2024-05-03T16:14:13.794837763+00:00"}"

Windows 10 Pro
Saleae = 0.12.0
python 3.7.7

saleae_test.py (2.7 KB)

@nick.smith Sorry to hear about that! We’ll get your python script on our queue to review and we’ll follow up with our findings.

Thank you Tim, let me know if you need any more information from me

If you want to debug more on your own, you might try using Wireshark to capture the network traffic. The Automation API (logic2-automation) uses gRPC as its underlying protocol, and there are ways to analyze gRPC messages within Wireshark itself.

Just looking at the error message you showed above:

It seems like the gRPC server connection was intermittently failing:

ipv4:127.0.0.1:10430

… according to the status code 14 (UNAVAILABLE), per the gRPC documentation:

The service is currently unavailable. This is most likely a transient condition, which can be corrected by retrying with a backoff. Note that it is not always safe to retry non-idempotent operations.
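As a rough illustration of the "retrying with a backoff" advice quoted above, here is a minimal sketch of a generic retry helper. The helper name and its parameters are hypothetical and are not part of the Saleae automation API. With gRPC, the `is_transient` check would presumably be something like `isinstance(err, grpc.RpcError) and err.code() == grpc.StatusCode.UNAVAILABLE`.

```python
import random
import time


def call_with_backoff(func, is_transient, max_attempts=5,
                      base_delay=0.5, max_delay=8.0, sleep=time.sleep):
    """Call func(), retrying with exponential backoff on transient errors.

    All names and default values here are illustrative assumptions.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return func()
        except Exception as err:
            # Give up on non-transient errors or when attempts are exhausted.
            if attempt == max_attempts or not is_transient(err):
                raise
            # Double the delay on every attempt, capped at max_delay.
            delay = min(max_delay, base_delay * 2 ** (attempt - 1))
            # Add up to 10% jitter so retries do not fire at a fixed rhythm.
            sleep(delay + random.uniform(0, delay / 10))
```

As the quoted documentation warns, retrying is only safe for calls that can be repeated without side effects, so wrapping something like a capture start or close call needs some care.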

Are you starting & running Logic.exe locally as a standard process natively on a Windows 10 desktop/laptop console, or are you doing something more elaborate (e.g., using a virtual machine or container, running remotely, etc.)?

Finally, if the issue is more like a race condition internal to the Saleae Logic 2 software itself than some type of local networking issue on the PC, you might be able to work around it by adding some delays between automation API calls, for example by inserting a time.sleep() call before each capture.command in your script.

Note: the line numbers in the quoted error message don’t appear to line up exactly with the attached saleae_test.py script (as the attached file was only 58 lines):

However, the traceback suggests it might be something between the capture.export_raw_data_csv() and capture.save_capture() API calls? If so, perhaps the ‘export csv’ didn’t completely finish and the gRPC server is still busy when the ‘save capture’ is called – so you could try adding a time.sleep() of a few seconds in between those calls first?

One last clue – there was another post about exporting raw data CSV sometimes failing, which might be related to your issue:

[Edit:]
Looks like Saleae has an example for automating long captures that is similar to your code, in their technical FAQ:

… where they suggest using python threading to reduce latency in between captures. This technique might help, especially if you need to insert a delay between saving the raw CSV and the capture file.

Alternatively, you could temporarily omit saving the raw CSV file and see if saving the capture file only resolves the problem? If so, you can always export the CSV file from the capture files (*.sal) later, rather than exporting it while collecting the data in the first place. For example, you could create a separate script that would use the automation API load_capture() method and export raw data CSV from there?
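A separate export script along those lines might look like the sketch below. It assumes the `manager` argument behaves like `saleae.automation.Manager`, with `load_capture()` returning a capture that can be used as a context manager and that offers `export_raw_data_csv()`. The function name, the folder layout, and the channel list are made up for illustration.

```python
from pathlib import Path


def export_saved_captures(manager, capture_dir, output_dir, digital_channels):
    """Export raw CSV data from every .sal file found in capture_dir.

    `manager` is assumed to behave like saleae.automation.Manager.
    Returns the list of directories that were written to.
    """
    exported = []
    for sal_file in sorted(Path(capture_dir).glob("*.sal")):
        # One output folder per capture, named after the capture file.
        target = Path(output_dir) / sal_file.stem
        target.mkdir(parents=True, exist_ok=True)
        # Leaving the `with` block closes the capture again.
        with manager.load_capture(str(sal_file)) as capture:
            capture.export_raw_data_csv(directory=str(target),
                                        digital_channels=digital_channels)
        exported.append(target)
    return exported
```

With Logic 2 already running and automation enabled, this could be called roughly as `with automation.Manager.connect(port=10430) as manager: export_saved_captures(manager, "captures", "csv", [0, 1])`, where the two folder names and the channel numbers are placeholders.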

Hi,

To answer your question, I am starting & running Logic.exe locally as a standard process natively on Windows 10 desktop.

I will try:

Adding a delay between each capture command
Running some experiments with export_raw_data_csv. I could try removing it altogether to see if it is causing the issue

Ah, yes I did remove some debug from the script I uploaded here. The original traceback message was from the code with debug prints added.

Nick

I have an updated script for you to try.

Somewhere between 1500 and 4000 captures, the Logic 2 application will crash. This is what causes the gRPC error.

I tried the following, without success:
Removing all other USB devices connected to the PC
Commenting out the save_capture function call, so it’s not saving or exporting anything

In the new script, I added a recovery mechanism to relaunch Logic 2. This makes sure that the test will run all the way through to 4000 captures.

I also noticed that the capture time increases throughout the test. I plotted the time taken per capture against the capture number in the attached image. Just before the application crashes, the captures are taking a long time, anywhere between 5 and 10 seconds (it’s set to do a 0.2 s capture).
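For anyone wanting to track the same slowdown without plotting, here is a minimal sketch of a timer that records each capture's duration and flags a sustained increase. The class name, the window sizes, and the threshold factor are arbitrary assumptions and are not taken from the attached script.

```python
import statistics
import time


class CaptureTimer:
    """Record how long each capture takes and flag a sustained slowdown."""

    def __init__(self, baseline_count=20, window=10, factor=3.0):
        self.durations = []
        self.baseline_count = baseline_count  # early captures used as reference
        self.window = window                  # most recent captures to compare
        self.factor = factor                  # slowdown ratio that counts as degraded

    def time_call(self, func):
        """Run func(), store its wall-clock duration, and return its result."""
        start = time.perf_counter()
        try:
            return func()
        finally:
            self.durations.append(time.perf_counter() - start)

    def is_degraded(self):
        """True when the recent median exceeds factor times the early median."""
        if len(self.durations) < self.baseline_count + self.window:
            return False
        baseline = statistics.median(self.durations[:self.baseline_count])
        recent = statistics.median(self.durations[-self.window:])
        return recent > self.factor * baseline
```

A test loop could wrap each capture in `timer.time_call(...)` and relaunch the application once `timer.is_degraded()` returns True, rather than waiting for the crash.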

test_saleae.py (4.6 KB)

@nick.smith Oh wow… there’s quite a lot to unpack and review here. Something may be leaking or building up behind the scenes.

I’m not sure if we’ll have an answer for you right away, but I’ll get this on the queue to review here in more detail.

Hi @nick.smith,

This is an issue we’re aware of, and several other users have reported it. Thanks for all of the details!

Your sample script & data help a lot; they will make it much easier to reproduce locally.

Unfortunately, our small team (3 developers) is slammed at the moment and we won’t be able to properly investigate this until a little later this year. In the meantime, we recommend using automation.Manager.launch to automate periodically relaunching the application at a fixed interval, instead of waiting for the application to crash.

I’m sorry we’re not able to get into this sooner!
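One way the periodic relaunch could be structured is sketched below. It assumes `launch_manager` is something like `automation.Manager.launch`, whose returned Manager shuts the application down when closed. The function name, the `do_capture` callback, and the relaunch interval are hypothetical.

```python
def run_captures(launch_manager, do_capture, total_captures, relaunch_every):
    """Run total_captures captures, restarting Logic 2 every relaunch_every.

    launch_manager: zero-argument callable returning a Manager-like object
                    with a close() method (assumed: automation.Manager.launch).
    do_capture:     callable taking (manager, index) that performs one capture.
    Returns the number of times the application was launched.
    """
    launches = 0
    manager = None
    try:
        for index in range(total_captures):
            if index % relaunch_every == 0:
                if manager is not None:
                    # Closing a launched Manager also shuts the application down.
                    manager.close()
                    manager = None
                manager = launch_manager()
                launches += 1
            do_capture(manager, index)
    finally:
        # Always shut down the last instance, even if a capture raised.
        if manager is not None:
            manager.close()
    return launches
```

For the 4000-capture test in this thread, a call such as `run_captures(automation.Manager.launch, one_capture, 4000, 500)` would restart the application well before the 1500 to 4000 range where crashes were seen. The interval of 500 is only a guess, and `one_capture` stands in for whatever the test does per iteration.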

Hi @markgarrison, ok thanks for the update.

If I understand correctly, based on the following descriptions in the API:

Launch the Logic2 application and shut it down when the returned Manager is closed.
Close connection to Saleae backend, and shut it down if it was created by Manager.

Is there a method to shut down the Logic 2 application if it was not created by Manager (it was opened manually)?

Hi @nick.smith, I’ll jump in for Mark for now. To clarify, are you asking for a way to close an existing Logic software session that was initially opened manually (i.e. not through Automation means)? If so, this is unfortunately not possible at the moment. In particular, we currently don’t have a way of executing Automation API functions on existing sessions. We’re tracking this particular feature request below right now. Feel free to add your vote to it in case this is what you needed!

I don’t think there is a generic method implemented in the automation API, but I think you can:

  • use Python’s subprocess module to invoke an external, OS-specific ‘task killing’ command (to search for and kill the ‘Logic 2’ process)
  • use the Python psutil package to terminate or kill the ‘Logic 2’ process
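A minimal sketch of the subprocess option, using `taskkill` on Windows and `pkill` elsewhere. The function names are made up, and the process name to pass in (for example `Logic.exe` on Windows) is an assumption that should be checked against what Task Manager actually shows.

```python
import subprocess
import sys


def build_kill_command(process_name, platform=sys.platform):
    """Return the OS command that force-kills processes by name."""
    if platform.startswith("win"):
        # /IM selects by image name, /F forces termination.
        return ["taskkill", "/IM", process_name, "/F"]
    # pkill -x matches the exact process name on Linux and macOS.
    return ["pkill", "-x", process_name]


def kill_process(process_name):
    """Run the kill command; return True if it reported success."""
    result = subprocess.run(build_kill_command(process_name),
                            capture_output=True, text=True)
    return result.returncode == 0
```

Calling `kill_process("Logic.exe")` before launching a fresh instance would clear out a manually opened session. A forced kill discards anything unsaved in the application. The psutil route would be similar: iterate over `psutil.process_iter(["name"])` and call `terminate()` on the matching processes.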