CSV Export Timestamp: Loss of Resolution?

While drafting an HLA for the Quadrature Encoder post, I noticed a minor discrepancy in the calculated rate between computing it in Python and computing it in Excel from the Simple Parallel Analyzer’s exported CSV output.

I think I narrowed it down to the possibility that the GetTimeString() helper function from the Analyzer SDK may not output the full resolution that is available in the Logic 2 software’s internal timestamps.

In particular, I observed:

The FrameV2 Start field shows: 1.000 024 872 s

However, the Simple Parallel CSV output for this same line shows:

"Simple Parallel","data",0.999963752,6.1116e-05,0x0000000000000000
"Simple Parallel","data",1.00002487,6.1024e-05,0x0000000000000000

And the same timestamp is also seen in the HLA’s CSV export:

"Quadrature Encoder","QuadEncoder",0.999963752,6.1116e-05,10961,16382.699868925474,16441.5,24574.04980338821
"Quadrature Encoder","QuadEncoder",1.00002487,6.1024e-05,10962,16361.256544543792,16443,24541.884816815687

This causes the following discrepancy when computing a rate value:

>>> print(1/(1.000024872 - 0.999963752)) # full timestamp precision
16361.256544495765
>>> print(1/(1.00002487 - 0.999963752))  # CSV output timestamp precision
16361.791943431652

As you can see from the Quadrature Encoder analyzer output, the calculated rate matches the internal (full-resolution) timestamp value: 1.000024872

However, if you try to use Excel to calculate the same value, it will match the other (reduced-resolution) result, because the CSV timestamp is missing the last digit.

Is GetTimeString() being limited to only 9 significant digits rather than 1 ns resolution? I would have expected 1.000024872 instead of 1.00002487 in the CSV output file, exactly matching the value in the table view (and, I assume, the internal timestamp value). However, it seems like some rounding / slight loss of precision is going on here. Or is there another way to get the timestamp value in the CSV output to exactly match the GUI’s Start column in the Data table view?

Note: the most pedantic eyes might notice the calculations are still ‘slightly off’ between:

  • 16361.256544543792 (value calculated by the python script extension)
    vs.
  • 16361.256544495765 (value calculated by Python 3.11.2 interpreter above)

… but I’m guessing that might be a different quantization-like issue related to Saleae’s internal saleae.data.GraphTime data type vs. Python’s built-in float data type (i.e., IEEE-754 binary64)?

@BitBob Thanks for letting us know about this! There seems to be some kind of rounding discrepancy occurring somewhere in the pipeline. There’s quite a bit to unpack here, and we appreciate all of the detailed analysis you provided!

The next step would be to try and reproduce this on our end. We’ll keep you updated on our findings.

Hi @BitBob,

GetTimeString() should have a precision of 15 places.

The data table in the sidebar, when showing HLA results or LLAs with FrameV2 support (like the Simple Parallel analyzer), has a precision of 9 places.

However, when you export the sidebar data table, or when you copy & paste rows out of the data table, we actually use another system to generate those time strings.

At first glance, it looks like that system should have 9 places of resolution too; however, I can’t test it locally because none of the data I have on hand has anything but zero in the 1 ns place.

Could you send me a copy of a capture, where you were able to produce that simple parallel output?

The file I used was attached to the quad encoder thread I linked above, but I’m pasting the link here for quick reference:
https://discuss.saleae.com/uploads/short-url/jwGy5gdCfOvJLkUP0FCR8vaJzEx.sal

(Hopefully the link can be cross-posted)

Original post:

I’ve also noticed this. I’ve attached two files of a clock output where the .sal has the exact expected clock values but the .csv doesn’t have the same resolution.
The CSV is saved as a .txt so I can attach it.
cpu_div_1000.sal (5.3 KB)
cpu_div_1000.txt (115.8 KB)

Thanks, I see the problem now: we’re actually fixing the number of significant digits, not the resolution.

Simple Parallel	data	0.051267576	0.001118744	0x0000000000000000
Simple Parallel	data	0.052386324	0.001148792	0x0000000000000000

Simple Parallel	data	2.30576561	0.001721168	0x0000000000000000
Simple Parallel	data	2.30748678	0.00139284	0x0000000000000000

Async Serial	data	14.8612715	9.8825e-05	0x01
Async Serial	data	14.8614796	9.8825e-05	0x02

We’re just setting the precision on the C++ stream, which has this behavior (which I didn’t know about).
https://en.cppreference.com/w/cpp/io/ios_base/precision

I’m experimenting with a fix right now, it might be as simple as switching to std::fixed for this.
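
For anyone following along, here’s a minimal stand-alone sketch of the behavior described above (the 14.86… value mirrors the Async Serial row shown earlier; the exact input is made up):

#include <iomanip>
#include <iostream>

int main() {
    double t = 14.861271523;  // a made-up timestamp a bit past 10 s

    // Default float format: precision counts total *significant digits*,
    // so 9 digits leaves only 7 after the decimal point at this magnitude.
    std::cout << std::setprecision(9) << t << '\n';                // 14.8612715

    // std::fixed: precision counts digits *after* the decimal point,
    // so the full nanosecond field survives regardless of magnitude.
    std::cout << std::fixed << std::setprecision(9) << t << '\n';  // 14.861271523
}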

Yeah, it looks like https://cplusplus.com/reference/ios/fixed/
discusses how the precision setting behaves differently depending on which format is active (i.e., default vs. std::fixed vs. std::scientific).

So, maybe either std::fixed with [set]precision(9+), or else keep the default format with [set]precision(15-17)? I think it depends on whether or not you want extra zeroes padded after the decimal point (which is how std::fixed formatting behaves); see the quick comparison below.
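
Here’s a rough side-by-side of the two options, just to illustrate the trade-off (values are arbitrary):

#include <iomanip>
#include <iostream>

int main() {
    double t = 1.000024872;

    // Option A: std::fixed + setprecision(9): full ns field, but pads trailing zeros.
    std::cout << std::fixed << std::setprecision(9) << t   << '\n';  // 1.000024872
    std::cout << std::fixed << std::setprecision(9) << 2.5 << '\n';  // 2.500000000

    // Option B: default format + setprecision(15): no padding, but pushing to
    // 16-17 significant digits can expose representation noise.
    std::cout << std::defaultfloat << std::setprecision(15) << t   << '\n';  // 1.000024872
    std::cout << std::defaultfloat << std::setprecision(17) << 0.1 << '\n';  // 0.10000000000000001
}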

Otherwise, if you’re using a C++20-capable compiler, there’s also std::format, per: Formatting library (since C++20) - cppreference.com
(this approach doesn’t modify the stream state)

Finally, I also found: GitHub - fmtlib/fmt: A modern formatting library
… if you want a formatting library without needing C++20 support in the compiler.

(also, per the readme.md of {fmt} you might get a performance boost when outputting floating-point numbers vs. using the native formatting of C++ iostreams, which might help the CSV export performance for large datasets :wink: )
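
For illustration only (obviously not the actual export code), the {fmt} spelling of a fixed 9-decimal-place timestamp would look something like:

#include <fmt/format.h>  // with C++20, std::format from <format> supports the same format specs

int main() {
    double t = 1.000024872;
    fmt::print("{:.9f}\n", t);  // 1.000024872  (fixed, 9 digits after the decimal point)
    fmt::print("{}\n", t);      // 1.000024872  (default: shortest round-trip representation)
}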

Note: I don’t know how long a trace you’d need to accumulate before the binary64 floating-point format would ultimately force you to give up the 9 fractional digits in exchange for the extended range (an inherent issue with floating-point formats). However, I think it could be as early as a day or two, depending on what you pick for the precision setting (86,400 seconds per day, i.e. a 24 h period, already consumes 5 significant digits for the integer part). So maybe std::fixed will still lose precision but just pad with extra 0’s once the timestamp magnitude gets large enough? Regardless, an updated precision setting and/or format change will at least hold up for timestamps >1 s, which is better than the baseline behavior.

[Edit:]
For a deeper dive into the technical details of resolving this, I did a few calculations and made some observations:

  • IEEE-754 binary64 has an effective 53-bit significand (mantissa), which implies an exact-integer range of [0 … 9,007,199,254,740,991] (i.e., 0 … (2^53-1))
  • To support 1 ns resolution, that implies a ~9M second range (~100 days) before risking loss of full resolution
  • 30 days is 30 * 24 * 60 * 60 = 2,592,000 seconds, or 7 digits of integer range
  • Adding 9 more fractional digits results in a total of 16 significant digits needed for a 30 day range at 1 ns resolution
  • Alternatively, an integer or fixed-point solution, such as a 64-bit integer counter of nanoseconds (or even picoseconds), would have a larger range and perfect resolution
    (i.e., no ‘inexact’ conversion issues between the binary64 floating-point encoding and a decimal text string)
    (2^64 - 1) = 18,446,744,073,709,551,615, which gives a >18 billion second (>570 year) range at 1 ns resolution
  • The decimal value 0.000000001 in IEEE-754 binary64 is actually stored internally as approximately 0.00000000100000000000000006228159145778
  • According to another source, the exact stored value is 0.0000000010000000000000000622815914577798564188970686927859787829220294952392578125 (the short snippet below demonstrates this, along with the 2^53 limit)
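
To make the last two bullets concrete, here’s a short stand-alone check (the commented outputs are from a glibc-based toolchain; any correctly rounding printf implementation should agree):

#include <cstdio>

int main() {
    // The binary64 value nearest to the decimal 1e-9, shown to 40 decimal places:
    std::printf("%.40f\n", 1e-9);
    // 0.0000000010000000000000000622815914577799

    // Integers are exactly representable in binary64 only up to 2^53:
    double limit = 9007199254740992.0;   // 2^53
    std::printf("%.0f\n", limit + 1.0);  // 9007199254740992  (the +1 is lost)
    std::printf("%.0f\n", limit + 2.0);  // 9007199254740994
}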

Finally, as an alternative feature that may not have as much impact on the existing implementation, could an extra checkbox be added in Export Table Data to output the internal sample counter (u64)? If so, the CSV output could be something like:

# SampleRateHz = 500000000
# SampleNumberAtZero = 589218
name,type,sample_number,start_time,duration,"position","rate","scaled_position","scaled_rate"
"Quadrature Encoder","QuadEncoder",0,-0.001178436,0.001178432,1,0,2,0

OR

SampleRateHz,500000000
SampleNumberAtZero,589218
name,type,sample_number,start_time,duration,"position","rate","scaled_position","scaled_rate"
"Quadrature Encoder","QuadEncoder",0,-0.001178436,0.001178432,1,0,2,0

Then the user can calculate the timestamp with the following equation (a small worked sketch follows the definitions below):
timestamp = (sample_number - SampleNumberAtZero) / SampleRateHz

Given:

  • SampleRateHz = output from GetSampleRate()
  • SampleNumberAtZero = the sample_number value at the capture’s 0 s reference point (i.e., t0)
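
A tiny worked sketch of that equation, using the hypothetical SampleRateHz / SampleNumberAtZero header fields and the example row above (none of this is an existing Logic 2 feature):

#include <cstdint>
#include <cstdio>

// timestamp = (sample_number - SampleNumberAtZero) / SampleRateHz
double sample_to_time(std::uint64_t sample_number,
                      std::uint64_t sample_number_at_zero,
                      std::uint64_t sample_rate_hz)
{
    // Subtract in integer arithmetic first, so the only rounding step is the
    // final conversion/division in double.
    const std::int64_t delta = static_cast<std::int64_t>(sample_number) -
                               static_cast<std::int64_t>(sample_number_at_zero);
    return static_cast<double>(delta) / static_cast<double>(sample_rate_hz);
}

int main() {
    // Values from the example rows: 500 MHz sample rate, t0 at sample 589218.
    std::printf("%.9f\n", sample_to_time(0, 589218, 500000000));  // -0.001178436
}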

Hi BitBob,

We have a fix in place now that will maintain 9 digits after the decimal point, but will still trim trailing zeros. (After looking at some sample exports, I decided that trimming trailing zeros is still a nicer overall output format.)
I expect this to be in the next software release.

We extensively use the fmt formatting library in our codebase!

As for time resolution, you brought up some interesting points! This is something we addressed very early in Logic 2. The old Logic 1.x software was very much integer-sample-number first, and we infrequently converted to floating-point time. This became much more complicated when we introduced analog sampling at a different rate from the digital sampling, and the application was modified to use the lowest common multiple of the two sample rates in most cases. One negative effect of this was that the analog and digital data were impossible to align well when the analog sample rate was much, much slower than the digital rate, because we could only adjust the synchronization by an integer number of analog samples.

Logic 2 is very much time-first under the hood, and we typically use absolute timestamps relative to the Unix Epoch. As you noticed, doubles don’t have the dynamic range to support timestamps of the required accuracy, so under the hood we have a custom type that we use to represent these timestamps.

We only use double precision numbers for relative times, and even then, only at the edges of the application.
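
For illustration (this is not our actual type, just a sketch of the general idea), a split fixed-point representation of such a timestamp looks something like:

#include <cstdint>

// Purely illustrative -- not Saleae's actual internal type. Splitting the timestamp
// into whole seconds plus an integer sub-second count avoids binary64's
// range/precision trade-off entirely.
struct Timestamp {
    std::int64_t  seconds;      // whole seconds since the Unix epoch
    std::uint64_t femtoseconds; // sub-second part, 0 .. 999'999'999'999'999

    // Relative times can safely drop back to double at the edges:
    double seconds_until(const Timestamp& later) const {
        return static_cast<double>(later.seconds - seconds) +
               (static_cast<double>(later.femtoseconds) -
                static_cast<double>(femtoseconds)) * 1e-15;
    }
};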