Skip to content

Conversation

@smcallis
Copy link
Contributor

This removes the dependency on OpenSSL, which we only need for its bignum. The main benefit of this is that targeting things like WASM are simpler without that dependency.

I didn't try as hard for performance optimization as OpenSSL (which also has a hard requirement around constant-time operation for security purposes, thus lots of inline assembly), but benchmarks indicate better performance for small operations (likely due to absl::InlinedVector), and within a factor of two for multiply up to 1024-bits .

           Bignum  OpenSSL
Add_Small  8.75ns  24.7ns
Add_Medium 24.6ns  23.9ns
Add_Large  32.1ns  28.3ns
Mul_Small  9.13ns  26.0ns
Mul_Medium 45.8ns  32.8ns
Mul_Large  249ns   147ns

The size of exactfloats is largely determined by the number of sequential multiplies that have to be done and we don't have that deep of math. I think the worst case is A.CrossProd(b).Dot(c) which would be two multiplies deep, which would get us up to 256-bit values (medium size above). So, I don't expect there to be any real perceivable performance difference.

@google-cla
Copy link

google-cla bot commented Sep 11, 2025

Thanks for your pull request! It looks like this may be your first contribution to a Google open source project. Before we can look at your pull request, you'll need to sign a Contributor License Agreement (CLA).

View this failed invocation of the CLA check for more information.

For the most up to date status, view the checks section at the bottom of the pull request.

@ericveach
Copy link

Just one quick comment -- there are cases where S2 needs to evaluate exact arithmetic expressions involving polynomials of degree at least 20 (see s2predicates.cc). These cases actually happen, and for certain kinds of input data they can actually occur frequently. So the performance of large multiplies may indeed be relevant, and it would be good to check how this change affects the various S2 benchmarks, especially those related to snap rounding (i.e. code that uses S2Builder).

@smcallis
Copy link
Contributor Author

I didn't realize we had polynomials that big, where specifically? In the symbolic perturbation logic? I know the benchmarks are scrubbed before pushing to github, but since https://github.com/google/benchmark is open source now maybe @jmr can un-scrub them, then I could verify any changes.

@jmr
Copy link
Member

jmr commented Sep 12, 2025

maybe @jmr can un-scrub them, then I could verify any changes.

It will take a while to release them cleanly. It looks like I might need to move the benchmarks to their own files and call BENCHMARK_MAIN().

Here are the benchmarks that touch BigNum::operator*:

s2polygon_benchmark.txt
s2edge_crossings_benchmark.txt
s2buffer_operation_benchmark.txt
encoded_s2shape_index_benchmark.txt

For some, you will need to find your own large polygon like the Pacific Ocean.

@jmr
Copy link
Member

jmr commented Sep 12, 2025

The main benefit of this is that targeting things like WASM are simpler without that dependency.

I see we have wasm tests for exactfloat that pass. Most s2geometry tests pass, too. The most obvious failures are wasm not supporting threads.

@jmr
Copy link
Member

jmr commented Sep 12, 2025

I didn't try as hard for performance optimization as OpenSSL

Are your benchmarks vs OpenSSL or BoringSSL? How do they compare?

@smcallis
Copy link
Contributor Author

smcallis commented Sep 12, 2025 via email

@ericveach
Copy link

ericveach commented Sep 12, 2025 via email

@smcallis
Copy link
Contributor Author

smcallis commented Sep 13, 2025

OK I'm testing against OpenSSL, and I instrumented bignum multiply to print the bit-width of the result. So far the worst case I've found is 17718 bits in the S2BufferOperation.RadiiAndErrorFractionCoverage test. So I added a "mega" class to my benchmarks with 18000 bits:

             OpenSSL  Bignum
BM_Add_Mega  187ns 279 ns
BM_Mul_Mega  44.5us 64.9 us

I ran the S2BufferOperation and there were a few +1% regressions but more where it got faster by up to 10% if you believe the numbers:

                                 OpenSSL       Bignum     Δ
BM_BufferPoints/1                  23205 ns     20911 ns  -9.9%
BM_BufferPoints/10                220006 ns    220017 ns  --
BM_BufferPoints/1000            23319450 ns  23465196 ns  +0.6%
BM_BufferConvexLoop/10/10          29648 ns     29766 ns  --
BM_BufferConvexLoop/1000/10      2581329 ns   2615347 ns  +1.3%
BM_BufferConvexLoop/1000/1000    2541123 ns   2584635 ns  +1.7%
BM_BufferConcaveLoop/10/10         37046 ns     37532 ns  +1.3%
BM_BufferConcaveLoop/1000/10     5171652 ns   5230552 ns  +1.1%
BM_BufferConcaveLoop/1000/100    8716401 ns   8642246 ns  --
BM_BufferLoDimFractal/48          179090 ns    170599 ns  -4.7%
BM_BufferLoDimFractal/3072      34772877 ns  32665130 ns  -6.0%
BM_BufferHiDimFractal/48          276299 ns    248392 ns  -10.1%
BM_BufferHiDimFractal/3072      29193815 ns  27876363 ns  -4.5%

I didn't compute percentages but I'll include them here for posterity:

s2edge_crossings_test:

                                                              OpenSSL   Bignum
BM_GetIntersection<&S2::GetIntersection>                         23.0 ns 23.5 ns
BM_GetIntersection<&GetIntersectionStableLDIgnoreValidity>        117 ns  118 ns
BM_GetIntersection<&S2::internal::GetIntersectionExact>          4921 ns 1516 ns -69%

(intersection got a lot faster here, maybe due to small bignums being inlined?)

encoded_s2shape_index_test:

                                                      OpenSSL         Bignum
BM_MutableIndexBuildDestruct/0/0                         1166 ns        1307 ns
BM_MutableIndexBuildDestruct/1/0                       129538 ns      130681 ns
BM_MutableIndexBuildDestruct/2/0                       195170 ns      192565 ns
BM_MutableIndexDecodeDestruct/0/0                         123 ns         123 ns
BM_MutableIndexDecodeDestruct/1/0                       11401 ns       11254 ns
BM_MutableIndexDecodeDestruct/2/0                       10759 ns       11071 ns
BM_EncodedIndexInitDestruct/0/0                          61.4 ns        72.9 ns
BM_EncodedIndexInitDestruct/1/0                           227 ns         195 ns
BM_EncodedIndexInitDestruct/2/0                           174 ns         154 ns
BM_S2PolygonDecodeDestructSnapped/0/0                     190 ns         192 ns
BM_S2PolygonDecodeDestructSnapped/1/0                    1283 ns        1310 ns
BM_S2PolygonDecodeDestructSnapped/2/0                   12786 ns       12689 ns
BM_S2PolygonDecodeDestructSnapped/0/1                     708 ns         705 ns
BM_S2PolygonDecodeDestructSnapped/1/1                   10563 ns       10605 ns
BM_S2PolygonDecodeDestructSnapped/2/1                   69893 ns       71190 ns
BM_LaxPolygonDecodeDestructSnapped/0/0                   26.9 ns        26.9 ns
BM_LaxPolygonDecodeDestructSnapped/1/0                   1353 ns        1361 ns
BM_LaxPolygonDecodeDestructSnapped/2/0                   1826 ns        1823 ns
BM_LaxPolygonDecodeDestructSnapped/0/1                    126 ns         127 ns
BM_LaxPolygonDecodeDestructSnapped/1/1                  11914 ns       12087 ns
BM_LaxPolygonDecodeDestructSnapped/2/1                  13145 ns       13928 ns
BM_EncodedPolygonInitDestructSnapped/0/0                 5.06 ns        5.37 ns
BM_EncodedPolygonInitDestructSnapped/1/0                 6.49 ns        6.97 ns
BM_EncodedPolygonInitDestructSnapped/2/0                 8.98 ns        9.50 ns
BM_EncodedPolygonInitDestructSnapped/0/1                 8.64 ns        10.1 ns
BM_EncodedPolygonInitDestructSnapped/1/1                 36.4 ns        28.8 ns
BM_EncodedPolygonInitDestructSnapped/2/1                 48.8 ns        38.2 ns
BM_S2PolygonGetEdge/0/0                                  2.67 ns        2.68 ns
BM_S2PolygonGetEdge/1/0                                  2.59 ns        2.59 ns
BM_S2PolygonGetEdge/2/0                                  2.86 ns        2.79 ns
BM_LaxPolygonGetEdge/0/0                                 3.47 ns        3.46 ns
BM_LaxPolygonGetEdge/1/0                                 3.37 ns        3.49 ns
BM_LaxPolygonGetEdge/2/0                                 4.09 ns        4.31 ns
BM_EncodedPolygonGetEdge/0/0                             2.54 ns        2.54 ns
BM_EncodedPolygonGetEdge/1/0                             2.60 ns        2.42 ns
BM_EncodedPolygonGetEdge/2/0                             6.92 ns        6.13 ns
BM_EncodedPolygonGetEdge/0/1                             27.5 ns        27.6 ns
BM_EncodedPolygonGetEdge/1/1                             30.7 ns        30.6 ns
BM_EncodedPolygonGetEdge/2/1                             31.2 ns        32.5 ns
BM_ContainsPointMutableIndexS2Polygon/0/0                 169 ns         167 ns
BM_ContainsPointMutableIndexS2Polygon/1/0                 179 ns         178 ns
BM_ContainsPointMutableIndexS2Polygon/2/0                 516 ns         513 ns
BM_ContainsPointEncodedIndexPolygon/0/0                   246 ns         244 ns
BM_ContainsPointEncodedIndexPolygon/1/0                   239 ns         237 ns
BM_ContainsPointEncodedIndexPolygon/2/0                   723 ns         730 ns
BM_ContainsPointEncodedIndexPolygon/0/1                   498 ns         498 ns
BM_ContainsPointEncodedIndexPolygon/1/1                   485 ns         483 ns
BM_ContainsPointEncodedIndexPolygon/2/1                  1844 ns        1836 ns
BM_VertexDistanceMutableIndexS2Polygon/0/0                278 ns         281 ns
BM_VertexDistanceMutableIndexS2Polygon/1/0                727 ns         708 ns
BM_VertexDistanceMutableIndexS2Polygon/2/0               1906 ns        1913 ns
BM_VertexDistanceEncodedIndexPolygon/0/0                  358 ns         353 ns
BM_VertexDistanceEncodedIndexPolygon/1/0                  854 ns         848 ns
BM_VertexDistanceEncodedIndexPolygon/2/0                 2438 ns        2441 ns
BM_VertexDistanceEncodedIndexPolygon/0/1                  849 ns         840 ns
BM_VertexDistanceEncodedIndexPolygon/1/1                 1572 ns        1536 ns
BM_VertexDistanceEncodedIndexPolygon/2/1                 5096 ns        5097 ns
BM_DecodePolygonBuildIndexContainsPointSnapped/0/0       1457 ns        1486 ns
BM_DecodePolygonBuildIndexContainsPointSnapped/1/0      99953 ns      100508 ns
BM_DecodePolygonBuildIndexContainsPointSnapped/2/0     235551 ns      233505 ns
BM_DecodePolygonBuildIndexContainsPointSnapped/0/1       2030 ns        2045 ns
BM_DecodePolygonBuildIndexContainsPointSnapped/1/1     109201 ns      110685 ns
BM_DecodePolygonBuildIndexContainsPointSnapped/2/1     294724 ns      298937 ns
BM_DecodeIndexAndPolygonContainsPointSnapped/0/0          538 ns         546 ns
BM_DecodeIndexAndPolygonContainsPointSnapped/1/0        20437 ns       19768 ns
BM_DecodeIndexAndPolygonContainsPointSnapped/2/0        25752 ns       24767 ns
BM_DecodeIndexAndPolygonContainsPointSnapped/0/1         1123 ns        1114 ns
BM_DecodeIndexAndPolygonContainsPointSnapped/1/1        29701 ns       29184 ns
BM_DecodeIndexAndPolygonContainsPointSnapped/2/1        82956 ns       83656 ns
BM_EncodedIndexAndPolygonContainsPointSnapped/0/0         307 ns         305 ns
BM_EncodedIndexAndPolygonContainsPointSnapped/1/0         511 ns         484 ns
BM_EncodedIndexAndPolygonContainsPointSnapped/2/0        1069 ns        1033 ns
BM_EncodedIndexAndPolygonContainsPointSnapped/0/1         561 ns         563 ns
BM_EncodedIndexAndPolygonContainsPointSnapped/1/1         885 ns         844 ns
BM_EncodedIndexAndPolygonContainsPointSnapped/2/1        2454 ns        2410 ns

(I'd say generally the same)

@jmr I can run the s2polygon_test benchmarks too, but there's several flags defined in that file defining parameters for the benchmarks I'll need.

@smcallis
Copy link
Contributor Author

smcallis commented Sep 13, 2025

As an aside @jmr if you get those benchmarks released here's a test_main.cc that can work with them:

#include "benchmark/benchmark.h"
#include "gtest/gtest.h"

int main(int argc, char* argv[]) {
  benchmark::Initialize(&argc, argv);
  testing::InitGoogleTest(&argc, (char**)argv);

  // Just run benchmark if explicitly requested.
  if (!benchmark::GetBenchmarkFilter().empty()) {
    benchmark::RunSpecifiedBenchmarks();
    exit(0);
  }

  return RUN_ALL_TESTS();
}

You just link with "gmock" instead of "gmock_main" and that test file and it lets you run the benchmarks like you'd expect with --benchmark_filter.

@ericveach
Copy link

ericveach commented Sep 15, 2025 via email

@smcallis
Copy link
Contributor Author

OK I used the compare.py that ships with benchmark now and disabled ASLR and CPU governors, and a min_time of 20s to get as fair a comparison as I could. All entries are (bignum - openssl)/openssl:

 Benchmark                                       Time             CPU      Time Old      Time New       CPU Old       CPU New
----------------------------------------------------------------------------------------------------------------------------
BM_BufferPoints/1                            -0.0082         -0.0082         18892         18738         18890         18736
BM_BufferPoints/10                           -0.0235         -0.0235        215633        210558        215609        210539
BM_BufferPoints/1000                         -0.0574         -0.0574      23350243      22009805      23347813      22007591
BM_BufferConvexLoop/10/10                    -0.0435         -0.0435         29588         28300         29585         28298
BM_BufferConvexLoop/1000/10                  -0.0101         -0.0101       2592775       2566717       2592526       2566441
BM_BufferConvexLoop/1000/1000                +0.0009         +0.0009       2559509       2561839       2559259       2561581
BM_BufferConcaveLoop/10/10                   -0.0002         -0.0001         37098         37093         37095         37089
BM_BufferConcaveLoop/1000/10                 -0.0126         -0.0126       5177795       5112490       5177284       5111983
BM_BufferConcaveLoop/1000/100                +0.0151         +0.0151       8372052       8498848       8371219       8497918
BM_BufferLoDimFractal/48                     -0.0490         -0.0490        179225        170447        179204        170431
BM_BufferLoDimFractal/3072                   -0.0521         -0.0521      34339890      32551229      34335649      32548081
BM_BufferHiDimFractal/48                     -0.0656         -0.0656        257797        240879        257771        240857
BM_BufferHiDimFractal/3072                   -0.0617         -0.0616      29076370      27283717      29073327      27281040
OVERALL_GEOMEAN                              -0.0286         -0.0286             0             0             0             0

Benchmark                                                                    Time             CPU      Time Old      Time New       CPU Old       CPU New
---------------------------------------------------------------------------------------------------------------------------------------------------------
BM_GetIntersection<&S2::GetIntersection>                                  +0.0002         +0.0002            19            19            19            19
BM_GetIntersection<&GetIntersectionStableLDIgnoreValidity>                +0.0052         +0.0052           117           118           117           118
BM_GetIntersection<&S2::internal::GetIntersectionExact>                   +0.0037         +0.0037          1338          1343          1338          1343
OVERALL_GEOMEAN                                                           +0.0030         +0.0030             0             0             0             0

Benchmark                                                            Time             CPU      Time Old      Time New       CPU Old       CPU New
-------------------------------------------------------------------------------------------------------------------------------------------------
BM_MutableIndexBuildDestruct/0/0                                  -0.0047         -0.0046          1055          1050          1055          1050
BM_MutableIndexBuildDestruct/1/0                                  -0.0001         -0.0010        125978        125965        125889        125765
BM_MutableIndexBuildDestruct/2/0                                  -0.0019         -0.0019        193622        193247        193603        193229
BM_MutableIndexDecodeDestruct/0/0                                 -0.0192         -0.0191           118           115           118           115
BM_MutableIndexDecodeDestruct/1/0                                 -0.0573         -0.0573         12000         11313         11999         11311
BM_MutableIndexDecodeDestruct/2/0                                 -0.0375         -0.0375         11211         10791         11210         10790
BM_EncodedIndexInitDestruct/0/0                                   +0.0018         +0.0018            64            64            64            64
BM_EncodedIndexInitDestruct/1/0                                   -0.0237         -0.0237           188           184           188           184
BM_EncodedIndexInitDestruct/2/0                                   +0.0164         +0.0164           145           147           145           147
BM_S2PolygonDecodeDestructSnapped/0/0                             -0.0530         -0.0533           190           180           190           180
BM_S2PolygonDecodeDestructSnapped/1/0                             -0.0176         -0.0176          1274          1251          1274          1251
BM_S2PolygonDecodeDestructSnapped/2/0                             -0.0024         -0.0024         12720         12690         12718         12688
BM_S2PolygonDecodeDestructSnapped/0/1                             -0.0178         -0.0178           689           676           689           676
BM_S2PolygonDecodeDestructSnapped/1/1                             -0.0085         -0.0085         10642         10551         10641         10550
BM_S2PolygonDecodeDestructSnapped/2/1                             -0.0332         -0.0332         70155         67827         70148         67821
BM_LaxPolygonDecodeDestructSnapped/0/0                            -0.0055         -0.0055            27            27            27            27
BM_LaxPolygonDecodeDestructSnapped/1/0                            +0.0507         +0.0507          1297          1363          1297          1363
BM_LaxPolygonDecodeDestructSnapped/2/0                            +0.0011         +0.0011          1817          1819          1816          1818
BM_LaxPolygonDecodeDestructSnapped/0/1                            +0.0000         -0.0001           126           126           126           126
BM_LaxPolygonDecodeDestructSnapped/1/1                            +0.0043         +0.0043         11971         12022         11970         12021
BM_LaxPolygonDecodeDestructSnapped/2/1                            +0.0136         +0.0136         13872         14060         13870         14059
BM_EncodedPolygonInitDestructSnapped/0/0                          +0.0335         +0.0335             5             5             5             5
BM_EncodedPolygonInitDestructSnapped/1/0                          +0.0119         +0.0120             7             7             7             7
BM_EncodedPolygonInitDestructSnapped/2/0                          +0.0059         +0.0059            10            10            10            10
BM_EncodedPolygonInitDestructSnapped/0/1                          +0.1442         +0.1442             9            10             9            10
BM_EncodedPolygonInitDestructSnapped/1/1                          -0.0200         -0.0200            28            28            28            28
BM_EncodedPolygonInitDestructSnapped/2/1                          +0.0082         +0.0081            38            38            38            38
BM_S2PolygonGetEdge/0/0                                           +0.0142         +0.0145             3             3             3             3
BM_S2PolygonGetEdge/1/0                                           +0.0006         +0.0006             3             3             3             3
BM_S2PolygonGetEdge/2/0                                           +0.0091         +0.0093             3             3             3             3
BM_LaxPolygonGetEdge/0/0                                          -0.0052         -0.0053             3             3             3             3
BM_LaxPolygonGetEdge/1/0                                          -0.0051         -0.0051             3             3             3             3
BM_LaxPolygonGetEdge/2/0                                          +0.0014         +0.0014             4             4             4             4
BM_EncodedPolygonGetEdge/0/0                                      -0.0003         -0.0003             3             3             3             3
BM_EncodedPolygonGetEdge/1/0                                      -0.0010         -0.0011             2             2             2             2
BM_EncodedPolygonGetEdge/2/0                                      +0.0007         +0.0007             5             5             5             5
BM_EncodedPolygonGetEdge/0/1                                      +0.0065         +0.0065            28            28            28            28
BM_EncodedPolygonGetEdge/1/1                                      +0.0065         +0.0065            31            31            31            31
BM_EncodedPolygonGetEdge/2/1                                      +0.0072         +0.0072            33            33            33            33
BM_ContainsPointMutableIndexS2Polygon/0/0                         +0.0074         +0.0074           167           168           167           168
BM_ContainsPointMutableIndexS2Polygon/1/0                         -0.0007         -0.0007           179           179           179           179
BM_ContainsPointMutableIndexS2Polygon/2/0                         +0.0079         +0.0079           513           517           513           517
BM_ContainsPointEncodedIndexPolygon/0/0                           +0.0018         +0.0020           244           244           244           244
BM_ContainsPointEncodedIndexPolygon/1/0                           -0.0088         -0.0088           241           239           241           239
BM_ContainsPointEncodedIndexPolygon/2/0                           -0.0037         -0.0037           732           729           732           729
BM_ContainsPointEncodedIndexPolygon/0/1                           -0.0091         -0.0093           503           499           503           499
BM_ContainsPointEncodedIndexPolygon/1/1                           +0.0059         +0.0060           484           487           484           487
BM_ContainsPointEncodedIndexPolygon/2/1                           -0.0132         -0.0132          1848          1824          1848          1824
BM_VertexDistanceMutableIndexS2Polygon/0/0                        -0.0109         -0.0111           279           276           279           276
BM_VertexDistanceMutableIndexS2Polygon/1/0                        -0.0058         -0.0058           725           720           724           720
BM_VertexDistanceMutableIndexS2Polygon/2/0                        +0.0154         +0.0159          1905          1935          1904          1934
BM_VertexDistanceEncodedIndexPolygon/0/0                          -0.0381         -0.0382           359           346           359           346
BM_VertexDistanceEncodedIndexPolygon/1/0                          -0.0099         -0.0095           859           850           858           850
BM_VertexDistanceEncodedIndexPolygon/2/0                          -0.0136         -0.0139          2469          2435          2468          2434
BM_VertexDistanceEncodedIndexPolygon/0/1                          -0.0124         -0.0124           850           840           850           839
BM_VertexDistanceEncodedIndexPolygon/1/1                          -0.0076         -0.0076          1563          1551          1563          1551
BM_VertexDistanceEncodedIndexPolygon/2/1                          -0.0053         -0.0053          5153          5126          5153          5125
BM_DecodePolygonBuildIndexContainsPointSnapped/0/0                -0.0111         -0.0106          1505          1488          1504          1488
BM_DecodePolygonBuildIndexContainsPointSnapped/1/0                +0.0041         +0.0041        101370        101788        101359        101777
BM_DecodePolygonBuildIndexContainsPointSnapped/2/0                -0.0273         -0.0273        227031        220832        227008        220811
BM_DecodePolygonBuildIndexContainsPointSnapped/0/1                -0.0104         -0.0104          2057          2036          2057          2036
BM_DecodePolygonBuildIndexContainsPointSnapped/1/1                +0.0147         +0.0147        109824        111436        109813        111427
BM_DecodePolygonBuildIndexContainsPointSnapped/2/1                +0.0133         +0.0133        285840        289652        285811        289623
BM_DecodeIndexAndPolygonContainsPointSnapped/0/0                  -0.0143         -0.0143           545           537           545           537
BM_DecodeIndexAndPolygonContainsPointSnapped/1/0                  +0.0030         +0.0030         20178         20237         20176         20235
BM_DecodeIndexAndPolygonContainsPointSnapped/2/0                  -0.0097         -0.0091         24945         24704         24929         24701
BM_DecodeIndexAndPolygonContainsPointSnapped/0/1                  -0.0021         -0.0021          1096          1094          1096          1094
BM_DecodeIndexAndPolygonContainsPointSnapped/1/1                  +0.0160         +0.0160         29431         29902         29428         29900
BM_DecodeIndexAndPolygonContainsPointSnapped/2/1                  +0.0190         +0.0190         80886         82421         80878         82413
BM_EncodedIndexAndPolygonContainsPointSnapped/0/0                 -0.0132         -0.0132           304           300           304           300
BM_EncodedIndexAndPolygonContainsPointSnapped/1/0                 +0.0182         +0.0182           485           494           485           494
BM_EncodedIndexAndPolygonContainsPointSnapped/2/0                 -0.0152         -0.0152          1047          1031          1047          1031
BM_EncodedIndexAndPolygonContainsPointSnapped/0/1                 +0.0098         +0.0098           558           564           558           564
BM_EncodedIndexAndPolygonContainsPointSnapped/1/1                 +0.0052         +0.0052           837           841           837           841
BM_EncodedIndexAndPolygonContainsPointSnapped/2/1                 +0.0095         +0.0096          2364          2387          2364          2386
OVERALL_GEOMEAN                                                   -0.0012         -0.0012             0             0             0             0

@ericveach I think OpenSSL's philosophy was to minimize the variance in their operations to avoid timing attacks, they admit GMP is faster in general than their bignum.

@ericveach
Copy link

ericveach commented Sep 16, 2025 via email

@jmr
Copy link
Member

jmr commented Sep 16, 2025

There is probably still a
version of ExactFloat based on MPFR (and that supports a lot more
operations and features) buried in Perforce somewhere.

You might even be able to find it in the Google Code archive. [Update: No, it looks like it was removed before the first open-source release.] If not, I can upload it again if it's worth benchmarking against.

typedef MPFloat<6300> ExactFloat; was how it was used.

@smcallis could you run the BoringSSL benchmarks? Building with bazel should use that.

@smcallis
Copy link
Contributor Author

Yeah I'll run those, probably be a day or two though.

@smcallis
Copy link
Contributor Author

Here's vs boringSSL (no mean feat, the bazel build is pretty crufty at this point):

Benchmark                                                                    Time             CPU      Time Old      Time New       CPU Old       CPU New
---------------------------------------------------------------------------------------------------------------------------------------------------------
BM_GetIntersection<&S2::GetIntersection>                                  +0.0126         +0.0126            18            19            18            19
BM_GetIntersection<&GetIntersectionStableLDIgnoreValidity>                -0.0387         -0.0387           133           128           133           128
BM_GetIntersection<&S2::internal::GetIntersectionExact>                   -0.5855         -0.5855          3159          1309          3159          1309
OVERALL_GEOMEAN                                                           -0.2611         -0.2611             0             0             0             0

Benchmark                                                            Time             CPU      Time Old      Time New       CPU Old       CPU New
-------------------------------------------------------------------------------------------------------------------------------------------------
BM_MutableIndexBuildDestruct/0/0                                  -0.0050         -0.0050           781           777           781           777
BM_MutableIndexBuildDestruct/1/0                                  +0.0203         +0.0204        128513        131122        128436        131051
BM_MutableIndexBuildDestruct/2/0                                  -0.0089         -0.0089        189086        187402        189067        187383
BM_MutableIndexDecodeDestruct/0/0                                 +0.0072         +0.0072           111           112           111           112
BM_MutableIndexDecodeDestruct/1/0                                 +0.0274         +0.0274         12200         12535         12199         12533
BM_MutableIndexDecodeDestruct/2/0                                 +0.0021         +0.0021         11535         11559         11534         11558
BM_EncodedIndexInitDestruct/0/0                                   -0.0017         -0.0017            63            63            63            63
BM_EncodedIndexInitDestruct/1/0                                   +0.0064         +0.0064           274           276           274           276
BM_EncodedIndexInitDestruct/2/0                                   -0.0133         -0.0133           209           206           209           206
BM_S2PolygonDecodeDestructSnapped/0/0                             -0.0199         -0.0199           162           159           162           159
BM_S2PolygonDecodeDestructSnapped/1/0                             -0.0076         -0.0076          1396          1385          1395          1385
BM_S2PolygonDecodeDestructSnapped/2/0                             -0.0696         -0.0696         11807         10985         11806         10984
BM_S2PolygonDecodeDestructSnapped/0/1                             -0.0401         -0.0401           657           631           657           631
BM_S2PolygonDecodeDestructSnapped/1/1                             -0.0037         -0.0037         10827         10787         10826         10786
BM_S2PolygonDecodeDestructSnapped/2/1                             -0.0096         -0.0096         68054         67399         68047         67393
BM_LaxPolygonDecodeDestructSnapped/0/0                            -0.0044         -0.0044            28            28            28            28
BM_LaxPolygonDecodeDestructSnapped/1/0                            +0.0111         +0.0111          1556          1574          1556          1574
BM_LaxPolygonDecodeDestructSnapped/2/0                            -0.0031         -0.0031          2084          2078          2084          2077
BM_LaxPolygonDecodeDestructSnapped/0/1                            +0.0146         +0.0146           123           124           123           124
BM_LaxPolygonDecodeDestructSnapped/1/1                            -0.0029         -0.0029         11920         11885         11919         11884
BM_LaxPolygonDecodeDestructSnapped/2/1                            +0.0041         +0.0041         13832         13888         13831         13887
BM_EncodedPolygonInitDestructSnapped/0/0                          -0.0076         -0.0076             6             6             6             6
BM_EncodedPolygonInitDestructSnapped/1/0                          -0.0746         -0.0746             7             6             7             6
BM_EncodedPolygonInitDestructSnapped/2/0                          -0.0057         -0.0057             9             9             9             9
BM_EncodedPolygonInitDestructSnapped/0/1                          -0.0375         -0.0375            10            10            10            10
BM_EncodedPolygonInitDestructSnapped/1/1                          +0.0337         +0.0337            44            46            44            46
BM_EncodedPolygonInitDestructSnapped/2/1                          +0.0097         +0.0097            59            59            59            59
BM_S2PolygonGetEdge/0/0                                           -0.0039         -0.0039             2             2             2             2
BM_S2PolygonGetEdge/1/0                                           -0.0066         -0.0066             2             2             2             2
BM_S2PolygonGetEdge/2/0                                           +0.0087         +0.0087             2             2             2             2
BM_LaxPolygonGetEdge/0/0                                          -0.1090         -0.1090             2             1             2             1
BM_LaxPolygonGetEdge/1/0                                          -0.0020         -0.0020             2             2             2             2
BM_LaxPolygonGetEdge/2/0                                          +0.0120         +0.0120             2             2             2             2
BM_EncodedPolygonGetEdge/0/0                                      -0.0034         -0.0034             2             2             2             2
BM_EncodedPolygonGetEdge/1/0                                      +0.0225         +0.0225             2             2             2             2
BM_EncodedPolygonGetEdge/2/0                                      +0.0099         +0.0099             5             5             5             5
BM_EncodedPolygonGetEdge/0/1                                      -0.0020         -0.0020            27            27            27            27
BM_EncodedPolygonGetEdge/1/1                                      +0.0110         +0.0110            30            30            30            30
BM_EncodedPolygonGetEdge/2/1                                      -0.0072         -0.0072            31            31            31            31
BM_ContainsPointMutableIndexS2Polygon/0/0                         +0.0176         +0.0176           166           169           166           169
BM_ContainsPointMutableIndexS2Polygon/1/0                         -0.0021         -0.0021           182           181           182           181
BM_ContainsPointMutableIndexS2Polygon/2/0                         -0.0034         -0.0034           514           512           514           512
BM_ContainsPointEncodedIndexPolygon/0/0                           +0.0168         +0.0169           240           244           240           244
BM_ContainsPointEncodedIndexPolygon/1/0                           +0.0276         +0.0276           231           237           231           237
BM_ContainsPointEncodedIndexPolygon/2/0                           -0.0052         -0.0053           723           719           723           719
BM_ContainsPointEncodedIndexPolygon/0/1                           +0.0462         +0.0462           475           497           475           497
BM_ContainsPointEncodedIndexPolygon/1/1                           +0.0567         +0.0567           456           482           456           482
BM_ContainsPointEncodedIndexPolygon/2/1                           +0.0382         +0.0382          1749          1816          1749          1816
BM_VertexDistanceMutableIndexS2Polygon/0/0                        +0.0125         +0.0125           279           282           279           282
BM_VertexDistanceMutableIndexS2Polygon/1/0                        -0.0154         -0.0154           721           710           721           710
BM_VertexDistanceMutableIndexS2Polygon/2/0                        -0.0088         -0.0088          1906          1889          1906          1889
BM_VertexDistanceEncodedIndexPolygon/0/0                          +0.0219         +0.0219           343           351           343           351
BM_VertexDistanceEncodedIndexPolygon/1/0                          -0.0044         -0.0044           847           843           847           843
BM_VertexDistanceEncodedIndexPolygon/2/0                          -0.0156         -0.0156          2420          2382          2420          2382
BM_VertexDistanceEncodedIndexPolygon/0/1                          +0.0131         +0.0131           836           847           836           847
BM_VertexDistanceEncodedIndexPolygon/1/1                          -0.0044         -0.0044          1534          1528          1534          1528
BM_VertexDistanceEncodedIndexPolygon/2/1                          +0.0006         +0.0006          5028          5030          5027          5030
BM_DecodePolygonBuildIndexContainsPointSnapped/0/0                -0.0017         -0.0017          1183          1181          1183          1181
BM_DecodePolygonBuildIndexContainsPointSnapped/1/0                -0.0075         -0.0076        100745         99985        100735         99974
BM_DecodePolygonBuildIndexContainsPointSnapped/2/0                -0.0733         -0.0732        230325        213454        230288        213431
BM_DecodePolygonBuildIndexContainsPointSnapped/0/1                +0.0135         +0.0135          1690          1713          1690          1712
BM_DecodePolygonBuildIndexContainsPointSnapped/1/1                -0.0113         -0.0113        110855        109605        110844        109595
BM_DecodePolygonBuildIndexContainsPointSnapped/2/1                -0.0106         -0.0106        281871        278891        281840        278862
BM_DecodeIndexAndPolygonContainsPointSnapped/0/0                  +0.0055         +0.0055           472           475           472           475
BM_DecodeIndexAndPolygonContainsPointSnapped/1/0                  +0.0074         +0.0074         20949         21105         20947         21103
BM_DecodeIndexAndPolygonContainsPointSnapped/2/0                  -0.0076         -0.0076         25006         24817         25003         24815
BM_DecodeIndexAndPolygonContainsPointSnapped/0/1                  +0.0068         +0.0068           989           996           989           996
BM_DecodeIndexAndPolygonContainsPointSnapped/1/1                  -0.0029         -0.0029         30555         30465         30552         30462
BM_DecodeIndexAndPolygonContainsPointSnapped/2/1                  -0.0019         -0.0019         81276         81124         81268         81116
BM_EncodedIndexAndPolygonContainsPointSnapped/0/0                 -0.0012         -0.0012           304           304           304           304
BM_EncodedIndexAndPolygonContainsPointSnapped/1/0                 -0.0145         -0.0145           559           551           559           551
BM_EncodedIndexAndPolygonContainsPointSnapped/2/0                 -0.0014         -0.0014          1076          1075          1076          1075
BM_EncodedIndexAndPolygonContainsPointSnapped/0/1                 +0.0012         +0.0012           556           557           556           557
BM_EncodedIndexAndPolygonContainsPointSnapped/1/1                 -0.0028         -0.0028           925           922           925           922
BM_EncodedIndexAndPolygonContainsPointSnapped/2/1                 -0.0049         -0.0049          2426          2414          2426          2414
OVERALL_GEOMEAN                                                   -0.0025         -0.0025             0             0             0             0

Benchmark                                       Time             CPU      Time Old      Time New       CPU Old       CPU New
----------------------------------------------------------------------------------------------------------------------------
BM_BufferPoints/1                            +0.0116         +0.0116         18663         18880         18661         18878
BM_BufferPoints/10                           +0.0015         +0.0015        215129        215446        215106        215424
BM_BufferPoints/1000                         +0.0102         +0.0102      22882009      23114805      22879552      23112244
BM_BufferConvexLoop/10/10                    +0.0061         +0.0061         28738         28915         28735         28912
BM_BufferConvexLoop/1000/10                  +0.0122         +0.0122       2553969       2585060       2553696       2584797
BM_BufferConvexLoop/1000/1000                +0.0112         +0.0112       2517311       2545442       2517047       2545186
BM_BufferConcaveLoop/10/10                   -0.0013         -0.0013         36083         36037         36079         36033
BM_BufferConcaveLoop/1000/10                 +0.0042         +0.0042       5138088       5159517       5137576       5158957
BM_BufferConcaveLoop/1000/100                -0.0019         -0.0019       8779280       8762583       8778229       8761603
BM_BufferLoDimFractal/48                     +0.0066         +0.0066        174485        175645        174468        175628
BM_BufferLoDimFractal/3072                   -0.0014         -0.0014      34644955      34596695      34641124      34592778
BM_BufferHiDimFractal/48                     -0.0258         -0.0258        257804        251161        257777        251135
BM_BufferHiDimFractal/3072                   -0.0016         -0.0016      28954233      28907645      28950900      28904357
OVERALL_GEOMEAN                              +0.0024         +0.0024             0             0             0             0

@jmr
Copy link
Member

jmr commented Sep 18, 2025

It looks like the cases that need ExactFloat are very uncommon. I don't see any ExactFloat samples when looking for S2Builder::SnapEdge in GWP.

The most cycles I can find in ExactFloat from S2 seem to be because of ExactCircleEdgeIntersectionSign -> ExactFloat::operator- -> ExactFloat::SignedSum and it's tiny.

I would rather optimize for ease of maintenance and not have to understand and carry this bignum code.

I'm not sure how "targeting things like WASM are simpler" with this, since WASM already works. Are there other reasons we should want our own bignum?

@ericveach
Copy link

ericveach commented Sep 18, 2025 via email

@smcallis
Copy link
Contributor Author

This was admittedly scratching my own itch. You have to have an openssl compiled with wasm installed before you can compile S2 with wasm, and it's known to be obnoxious. I'm travelling now but when I get back I'll see if I can get some estimates on how much smaller the wasm binary is without OpenSSL too.

@jmr
Copy link
Member

jmr commented Sep 19, 2025

BTW I don't think it's worth anyone's effort to dig up the MPFR-based ExactFloat

It actually wasn't that hard. I had already done it, just hadn't uploaded the results.

Base is OpenSSL, experiment is MPFL:
exactfloat-vs-mpfloat-elapsed.txt

Mostly small differences, but BM_GetIntersection<&S2::internal::GetIntersectionExact> is 30% faster with MPFloat.

I won't be distributing the MPFloat code.

I will make a new release with the ExactFloat tests. Benchmarks, too, if I have time.

@smcallis
Copy link
Contributor Author

Managed to get my WASM build up and running again, the output binary is ~10% smaller without OpenSSL in there. I'm sure that effect's larger for just S2 itself too.

@jmr
Copy link
Member

jmr commented Sep 22, 2025

the output binary is ~10% smaller without OpenSSL in there

What are the raw numbers?

@smcallis
Copy link
Contributor Author

smcallis commented Sep 22, 2025 via email

@jmr
Copy link
Member

jmr commented Sep 24, 2025

It turns out S2 is no longer the only (or even main) user of ExactFloat, so I have to do some additional testing.

Can you:

  1. rewrite the commit author to use your Google address
  2. go over the code yourself as if you were code reviewing it
  3. add your benchmarks, ifdefed out is fine for now
  4. rebase onto the new version I just pushed?

@jmr jmr changed the title Add bignum.h and port exactfloat to use it. exactfloat: Remove OpenSSL dependency Sep 24, 2025
@smcallis
Copy link
Contributor Author

It turns out S2 is no longer the only (or even main) user of ExactFloat, so I have to do some additional testing.

Can you:

  1. rewrite the commit author to use your Google address
  2. go over the code yourself as if you were code reviewing it
  3. add your benchmarks, ifdefed out is fine for now
  4. rebase onto the new version I just pushed?

Yeah I'll do all these, I'm employed with Google through October 5th so after that it won't make sense to use my google email though.

@jmr
Copy link
Member

jmr commented Oct 21, 2025

I will test this in google3 tomorrow.

@jmr
Copy link
Member

jmr commented Oct 22, 2025

I will test this in google3 tomorrow.

This is being tested now. This review made me realize that there were some interface fixes that could be done, so I will export a new version and you will have merge conflicts to resolve.

@jmr
Copy link
Member

jmr commented Oct 24, 2025

This is being tested now.

Testing looks good.

so I will export a new version and you will have merge conflicts to resolve.

This now has merge conflicts. You can either rebase or merge. I'm just going to squash and merge this in the end.

@smcallis
Copy link
Contributor Author

Should be merged now.

ExactFloat::ExactFloat(const ExactFloat& b)
: sign_(b.sign_), bn_exp_(b.bn_exp_) {
BN_copy(bn_.get(), b.bn_.get());
if (v == std::numeric_limits<int>::min()) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I meant you can replace this whole if/else block by:

// This works for `INT_MIN` since `abs(INT_MIN) == INT_MIN == -2^31` and
// casting to `unsigned` is modulo 2^32. `-2^31 mod 2^32 == 2^31`,
// the desired result.
bn_ = Bignum(static_cast<unsigned>(std::abs(v)));`

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The problem is that, since std::abs returns a signed value, that std::abs(INT_MIN) will throw a UBSAN error: https://godbolt.org/z/8TnndTsqP

And you can't just static_cast because e.g. static_cast<unsigned>(-5) == 4294967291

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, so abs(INT_MIN) is actually UB. Then I suggest __attribute__((no_sanitize("signed-integer-overflow"))).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So you want me to leave the UB and just do a static_cast but disable the warning?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, either that or figure out how to apply -fwrapv to this one function. Since we have a test case for INT_MIN I'm not very worried about this UB.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Eric's idea sounds good.

// Calculates abs(v) without UB.  SafeAbs(INT_MIN) == INT_MIN.
// Generates the same code as std::abs().
// https://godbolt.org/z/eT6KW1zGb
int SafeAbs(int v) {
  return v < 0 ? -static_cast<unsigned>(v) : v;
}

@jmr
Copy link
Member

jmr commented Oct 25, 2025

@ericveach There are ~15 unresolved conversations here where smcallis has responded. Can you (ericv) check that the latest code is ok and either resolve them or request further changes?

@ericveach
Copy link

ericveach commented Oct 30, 2025 via email

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants