Unable to load exported binary data on Mac (overflow error)

Hi -

I am trying to load exported binary data via the Python API for the first time, and it throws an overflow error.

Data generation:

  • run triggered capture for a few seconds (125 Msps, 2 channels but affects any capture)
  • File > Export raw data > All time + binary + others default > save files

Loading:

  • Follow the canonical example, verbatim, from “Loading a previously saved (or exported) capture”

Error:

Error loading capture from …: OverflowError: Python int too large to convert to C long

Versions

  • Logic 2, version 2.4.39, macOS
  • MSO API, version 0.5.3
  • Python 3.14

Is this a known bug in the API or am I doing something wrong?

Thanks!
Joey

TL;DR:

Use the right loading method for the data file type: saved capture (*.sal) or exported binary (*.bin).

The Logic MSO Python API is still new (v0.5.3), so it is not yet full-featured for every use case.
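The TL;DR above can be sketched as a simple dispatch. Note that loader_for is a hypothetical helper, not part of the API; the returned strings just name which loading approach to reach for:

```python
from pathlib import Path

def loader_for(path):
    """Pick a loading approach based on the data file type (sketch).

    Capture directories / *.sal -> the higher-level Capture re-load APIs
    Exported *.bin              -> saleae.mso_api.binary_files.read_file
    """
    p = Path(path)
    if p.is_dir() or p.suffix == ".sal":
        return "capture"    # use the Capture class (e.g. Capture.from_dir)
    if p.suffix == ".bin":
        return "read_file"  # use binary_files.read_file
    raise ValueError(f"unrecognized data file: {path}")

print(loader_for("test_analog_0.bin"))  # read_file
print(loader_for("my-capture.sal"))     # capture
```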

Details:

If you exported binary data, look at read_file:

Note: this method is not yet well defined, per the current (v0.5.3) API documentation:

It’s primarily used internally by the Capture class, but can also be used directly for advanced use cases.

Otherwise, you can access and parse the binary format directly:

However, if you saved a capture with MSO.capture() (or via Save Capture in the GUI, as opposed to exporting data):

After saving a capture, look at the Re-Loading Previously Saved Captures documentation. With this, you will have the higher-level Capture class, which is better defined than the read_file method above:

Thanks for the reply!

I experimented with both capture() and read_file() earlier, but your comment helped to clarify the proper usage for me. I was getting errors with read_file(), which is why I had started testing capture() instead. Here is a corrected example with a sample file and instructions to reproduce the error.

  1. Record some test data.

  2. Export a binary using File > Export raw data as previously described. Name the file ‘test’, resulting in test_analog_0.bin in the case of one channel. I’ve zipped and attached a sample file.

  3. Following the API for loading binary files, run this:

from saleae.mso_api.binary_files import read_file

fn = 'test_analog_0.bin'
result = read_file(fn)

Here is the resulting error:

C:\Users\jcdoll\AppData\Local\Programs\Python\Python313\Lib\site-packages\numpy\_core\memmap.py:263: RuntimeWarning: overflow encountered in scalar multiply
  bytes = int(offset + size*_dbytes)
Traceback (most recent call last):
  File "<python-input-9>", line 1, in <module>
    result = read_file(fn)
  File "C:\Users\jcdoll\AppData\Local\Programs\Python\Python313\Lib\site-packages\saleae\mso_api\binary_files.py", line 212, in read_file
    saleae_file.contents = _read_analog_export_v1(f, file_path)
                           ~~~~~~~~~~~~~~~~~~~~~~^^^^^^^^^^^^^^
  File "C:\Users\jcdoll\AppData\Local\Programs\Python\Python313\Lib\site-packages\saleae\mso_api\binary_files.py", line 383, in _read_analog_export_v1
    voltages = np.memmap(file_path, dtype=np.float32, mode="r", offset=current_offset + 40, shape=(num_samples,))
  File "C:\Users\jcdoll\AppData\Local\Programs\Python\Python313\Lib\site-packages\numpy\_core\memmap.py", line 289, in __new__
    mm = mmap.mmap(fid.fileno(), bytes, access=acc, offset=start)
OverflowError: memory mapped length must be positive
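Side note on why this surfaces as an OverflowError: np.memmap computes the mapping length as offset + num_samples * itemsize, so a misparsed (huge or negative) sample count wraps before mmap ever runs. A defensive sketch of the idea, validating the count against the actual file size before mapping (safe_memmap_f32 is a hypothetical helper, not the library's code):

```python
import os
import numpy as np

def safe_memmap_f32(path, offset, num_samples):
    """Map num_samples float32 values at `offset`, validating against the
    file size first. Hypothetical helper -- not part of saleae.mso_api."""
    file_size = os.path.getsize(path)
    needed = offset + num_samples * np.dtype(np.float32).itemsize
    if num_samples < 0 or needed > file_size:
        raise ValueError(
            f"parsed num_samples={num_samples} does not fit a {file_size}-byte "
            f"file (needs {needed} bytes); the header was probably misparsed"
        )
    return np.memmap(path, dtype=np.float32, mode="r",
                     offset=offset, shape=(num_samples,))

# Demo with a tiny synthetic file: 8-byte header + 4 float32 samples
path = "demo.bin"
with open(path, "wb") as f:
    f.write(b"\x00" * 8)
    np.arange(4, dtype=np.float32).tofile(f)

print(safe_memmap_f32(path, 8, 4))  # maps fine
try:
    safe_memmap_f32(path, 8, 10**18)  # bogus count caught before mmap
except ValueError as e:
    print("caught:", e)
```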

Sample file to reproduce (or use any GUI-exported .bin file):

test_analog_0.zip (2.4 MB)

For comparison, running the full programmatic capture-and-reload flow works fine:

# Capture some data programmatically
from pathlib import Path
from saleae import mso_api

mso = mso_api.MSO()

capture_config = mso_api.CaptureConfig(
    enabled_channels=[
        mso_api.AnalogChannel(channel=0, name="clock"),
    ],
    analog_settings=mso_api.AnalogSettings(sample_rate=100e6),
    capture_settings=mso_api.TimedCapture(capture_length_seconds=0.1),
)

save_dir = Path('my-capture')
capture = mso.capture(capture_config, save_dir)

# Load the capture
cap = mso_api.Capture.from_dir(save_dir)

# Print information about the capture
print(f"Channels: {cap.analog_channel_names}")
print(f"Sample rate: {cap.analog_sample_rate} Hz")
print(f"Duration: {cap.analog_stop_time - cap.analog_start_time:.6f} seconds")

But it produces more than just the .bin output:

analog_0.bin
capture_stdout.txt
record_result.json
capture_config_full.json
record_options.json

So the bug appears to be an overflow error when loading GUI-exported .bin files via read_file().

Thanks!
Joey

I wonder if read_file is intended for the internally stored *.bin (within *.sal) rather than GUI-exported *.bin files? I haven’t tried that API myself, as it isn’t as well documented, but I recall there was a difference between the exported and internally captured binary formats. Also, I didn’t see a way to export binary from MSO, only from the non-MSO Python automation API.

It would be nice to standardize the APIs between the two, but the hardware and capabilities aren’t the same. I understand it would be quite a challenge to define a superset API that handles both gracefully, emulates common features not directly supported when possible, and provides helpful hints / exceptions when an API could only be supported on one hardware type. This problem is compounded by two different GUIs with unique features for each hardware type, too. For now, only the MSO side gets the extra analog features in the automation API, but perhaps the non-MSO will benefit sometime in the future.

Edit: an older post about the internal binary format for original Logic 2 *.sal files (likely out of date for new MSO captures and possibly newer Logic 2 software releases):

… if this is what read_file actually handles, then I hope the online documentation is updated to clarify this, as it currently points to the exported binary format for reference.

For my use case I needed to do some detailed Python analysis of captured data, so I needed a way to capture + import + analyze.

Two solutions so far to work around the problem on my end:

  1. Capture in GUI, export as CSV, import and process (easy but large files)
  2. Capture via python API and either immediately analyze or reload and analyze
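For workaround 1, a minimal sketch of loading an exported analog CSV with numpy. The column header here is an assumption based on typical Logic 2 analog exports; adjust it (and the StringIO stand-in for a real file path) to match your export:

```python
import io
import numpy as np

# Simulated Logic 2 analog CSV export (assumed header: "Time [s],Channel 0")
csv_text = """Time [s],Channel 0
0.000000,0.12
0.000001,0.25
0.000002,0.50
"""

# For a real export, pass the file path instead of the StringIO buffer
data = np.genfromtxt(io.StringIO(csv_text), delimiter=",", skip_header=1)
t, v = data[:, 0], data[:, 1]  # time column, voltage column
print(t)
print(v)
```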

FWIW the result of capturing two channels via the python API are:

  • analog_0.bin
  • analog_1.bin
  • capture_config_full.json
  • capture_stdout.txt
  • record_options.json
  • record_result.json

There is no .sal file, but the Python code can load and plot the results via read_file() as long as the JSON files accompany it. This answers your question about whether read_file() only works on the .bin inside .sal files: the answer is no.

I don’t have the bandwidth to dig through the Python API to see what is going on, but it appears to use the data in the JSON files to avoid the overflow problem I ran into.
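Based on the directory listing above, a quick (hypothetical) pre-flight check can at least distinguish an API capture directory from a bare GUI export before calling read_file():

```python
from pathlib import Path

# Companion files produced alongside analog_*.bin by an MSO API capture
# (taken from the directory listing earlier in this thread)
COMPANIONS = ("capture_config_full.json", "record_result.json", "record_options.json")

def missing_companions(bin_path):
    """Return the companion JSON files NOT present next to the given .bin file.
    A bare GUI export typically has none of them; an API capture has all three.
    Hypothetical helper, not part of the saleae.mso_api library."""
    d = Path(bin_path).parent
    return [name for name in COMPANIONS if not (d / name).exists()]

missing = missing_companions("my-capture/analog_0.bin")
if missing:
    print("looks like a GUI export; read_file() may misparse it:", missing)
```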

So it would be nice if either:
A) the Python API could safely load binary output from the GUI, or
B) the GUI binary export also included the magic additional files to allow processing via the API

Thanks @jcdoll1 for reporting this, and sorry for the trouble!

We found the issue. There is a bug in the binary parsing code in the MSO API library that fails to correctly parse export files containing more than one waveform.

When using Logic MSO from the Logic software, every waveform (trace) that you see is kept in memory until the PC-side buffer is full. That means that during typical use, hundreds or thousands of traces are stored in memory. You can view these later by entering history mode. When you export analog data that was recorded with Logic MSO, you have options to export all waveforms, or just the current waveform. In this case, your export file contained over 200 stored waveforms.

This can be done using the MSO API as well, by setting MinWaveforms to a value greater than 1.

In either case, the MSO API parsing code failed to handle more than one waveform. I have a quick fix open in a pull request now: https://github.com/saleae/mso-api/pull/10

I expect we’ll be able to get that shipped pretty soon, but I haven’t had a chance to talk with the engineer who runs the MSO API quite yet.

In the meantime, you could either make that change locally, or clone the repo, check out the branch with the fix, and use that instead of the installed version. In my case, I just used the Python error output to get the path of the crashing file and edited it in place for now.

Thanks again for reporting this! I look forward to getting this shipped in the next release.