Proj.4 versus Apache SIS: an accuracy comparison

In August 17th 2017, we presented an introduction to Apache SIS in the Free and Open Source Software for Geospatial (FOSS4G) conference. A discussion about accuracy were planed, but skipped because of lack of time. The results that we intended to show are below.

Disclaimer: the author of this blog is an Apache SIS contributor.

Summary

The accuracy of two map projection libraries were compared:

  • Proj.4, a library in the C language.
  • Apache Spatial Information System (SIS), a library in the Java language.

The benchmark code was written in Java. The code measured both performance and accuracy. Those two measurements were done together in order to ensure that performance comparisons were applied on equivalent coordinate operations. Results of performance comparisons were presented in a previous blog post. Present blog post is about the accuracy part.

Tests done for this blog post expand on the Geospatial Integrity of Geoscience Software (GIGS) tests procedure. We require an accuracy of one centimetre, unless the coordinate operation is documented in the EPSG database as having a larger accuracy. In every cases, we require the results after 100 “forward projection – inverse projection” round trips to drift by less than one meter. However while we used some procedures, we didn’t used GIGS data for this blog. A subset of GIGS tests is executed by the Apache SIS project and a smaller subset is executed in the GeoAPI project for Proj4 and UCAR libraries, but they need to be expanded before further discussion.

The ISO 19111 standard recognizes two broad categories of coordinate operations: conversionsand transformations. Those two categories need to be analyzed separately.

Coordinate conversions

This category includes map projections. For all tested conversions, Apache SIS and Proj.4 are in close agreement. Both implementations show drifts much smaller than one meter (and almost always smaller than 1 millimetre) after 100 “forward projection – inverse projection” round trips. Apache SIS has smaller drifts for projections implemented by exact formulas (mostly projections on sphere), presumably because of stricter trigonometric functions in Java compared to C/C++ and because of Apache SIS aggressive concatenation of linear steps. Proj.4 has smaller drifts (except for Cylindrical Equal Area“cea”) for projections on Earth approximated by series expansions or iterative algorithms, presumably because of more conservative formulas in series expansions and because of iteration stop conditions controlled by smaller tolerance thresholds. However in both libraries the drift still much smaller than the accuracy requested by EPSG.

For map projections on planets having higher eccentricity than Earth (e.g. Jupiter), two behaviors are observed: map projections implemented by iterative algorithms keep a stable accuracy, at the cost of more iterations. But map projections implemented by series expansions have growing errors as the eccentricity increases. For a planet flattened like Jupiter but scaled at Earth size (only for comparison purpose with other projections in this test), the Proj.4 “cea” projection has a drift of 9 km after 100 iterations. Apache SIS drift for the same projection is still below a millimetre, because of SIS dual implementation (series expansion completed by iterations when necessary). Such dual implementation is possible for Mercator (not transverse) and a few other projections.

Coordinate transformations

This category includes datum shifts. Apache SIS and Proj.4 produce the same results when they use the same operation method with the same Bursa-Wolf parameters, but their results can differ by 1 or 2 meters when those two libraries selected different parameter values. The Bursa-Wolf parameters are the same for tested datum shifts from/to WGS84 except in North America. But the parameters differ for datum shifts between other datums than WGS84, for example Martinique 1938 to RGAF09. In all tested cases, Apache SIS uses the Bursa-Wolf parameters as specified by the EPSG geodetic dataset (late-binding approach) while the Proj.4 early-binding approach puts a 0.9 ± 0.4 meter error in Martinique case, which is about 10 times the expected accuracy. Proj.4 does not give any indication to users about such inaccuracy; currently, only informed users could suspect a problem by inspecting the Proj.4-specific +towgs84 elements in definition strings and guessing how Proj.4 may use them.

For coordinate transformations in North America, Apache SIS and Proj.4 may agree or disagree depending on the geographic area. The EPSG registry does not provide Bursa-Wolf parameters covering North America as a whole (both USA and Canada). But it provides parameters for smaller areas, for example a particular state. Apache SIS takes the geographic area in account when selecting Bursa-Wolf parameters, while Proj.4 API does not have this option. Proj.4 may use NADCON grids instead, but has no standard way to tell what it actually does. For this reason, while both Proj.4 and Apache SIS can handle NADCON grids, their usages have not been compared yet.

Material and method

The comparison has been performed between Proj.4 release 4.9.3 (August 2016) and Apache SIS 0.8-jdk8-SNAPSHOT (August 2017). The environment is Java 1.8.0_144-b01 on MacOS 10.12.6. The comparison program uses the GeoAPI interfaces. GeoAPI 3.0 is an OGC standard which allows to write code once and run it on different implementations. It is similar in this respect to Java DataBase Connectivity (JDBC) interfaces. The “Proj.4 versus Apache SIS: a performance comparison” blog post gives more details about GeoAPI implementations, which Coordinate Reference System (CRS) are used and how operations are instantiated. The table of CRS is repeated here for convenience:

SOURCE CRS TARGET CRS
EPSG:4053 – International 1924 Authalic Sphere
EPSG:3410 – NSIDC EASE-Grid Global
EPSG:4326 – WGS 84
EPSG:6933 – WGS 84 / NSIDC EASE-Grid 2.0 Global
EPSG:4326 – WGS 84
EPSG:3857 – WGS 84 / Pseudo-Mercator
EPSG:4326 – WGS 84
EPSG:3395 – WGS 84 / World Mercator
EPSG:4269 – NAD83
EPSG:3978 – NAD83 / Canada Atlas Lambert
EPSG:4326 – WGS 84
EPSG:3031 – WGS 84 / Antarctic Polar Stereographic
EPSG:4269 – NAD83
EPSG:5070 – NAD83 / Conus Albers
EPSG:3994 – WGS 84 / Mercator 41
EPSG:3395 – WGS 84 / World Mercator
EPSG:4301 – Tokyo
EPSG:4612 – JGD2000
EPSG:4277 – OSGB 1936
EPSG:4230 – ED50
EPSG:4625 – Martinique 1938
EPSG:5489 – RGAF09
Tested coordinate operation pairs

Note that Transverse Mercator projections are not tested yet, for reasons given later. Some statements in the following paragraphs do not apply to Transverse Mercator, Krovak and a few other projection methods.

All forward projections tested in this benchmark use exact formulas. Implementation of those formulas in Proj.4 and Apache SIS are assumed correct; we do not verify the projection results correctness (such verification is the topic of another test suite, based on Geospatial Integrity of Geoscience Software tests). The inverse operations however are often more complex, and most inverse projections have no exact formulas. They use approximations given in two ways:

  • By series expansion: there is no programmatic loop and the number of terms is hard-coded. Apache SIS and Proj.4 use the same number of terms, as given in Snyder’s book (Map projections – a working manual) and in EPSG guidance notes.

  • By iterative algorithm: a loop is executed until the change between two iterations is below a tolerance threshold. Apache SIS uses a larger tolerance threshold than Proj.4, thus saving a few iterations at the expense of accuracy.

Those approximations produce the same results (in the limit of the accuracy required by EPSG) for Mercator (not Transverse), Lambert Conic Conformal, Stereographic, Albers Equal Area and Cylindrical Equal Area projections. Consequently we do not consider whether Apache SIS and Proj.4 use the same approximations. We do not include the Transverse Mercator projection in this analysis because series expansions and iterative algorithms published for that projection are known to produce different results, which leads EPSG to consider them as different map projection methods.

For all comparisons done, we verify that Proj.4 and Apache SIS perform the same coordinate operation (ignoring the approximation algorithms described above) by looking at the Well Known Text 2 (ISO 19162) representation. WKT 2 specification defines a representation not only for Coordinate Reference Systems, but also for Coordinate Operations between two CRS. The CRS WKT tells us what were the inputs and what are the outputs after the fact, but does not tell us what happened between the two. The coordinate operation WKT allows us to inspect what the library actually does. For example in the case of a transformation from EPSG:4269 to EPSG:3978, Apache SIS gives us:

CoordinateOperation["Canada Atlas Lambert",
  SourceCRS[…definition omitted for brevity…, Id["EPSG", 4269, "9.1"]]],
  TargetCRS[…definition omitted for brevity…, Id["EPSG", 3978, "9.1"]]],
  Method["Lambert Conic Conformal (2SP)"],
    Parameter["semi_major", 6378137.0, Unit["metre", 1]],
    Parameter["semi_minor", 6356752.314140356, Unit["metre", 1]],
    Parameter["Latitude of false origin", 49.0, Unit["degree", 0.017453292519943295]],
    Parameter["Longitude of false origin", -95.0, Unit["degree", 0.017453292519943295]],
    Parameter["Latitude of 1st standard parallel", 49.0, Unit["degree", 0.017453292519943295]],
    Parameter["Latitude of 2nd standard parallel", 77.0, Unit["degree", 0.017453292519943295]],
    Parameter["Easting at false origin", 0.0, Unit["metre", 1]],
    Parameter["Northing at false origin", 0.0, Unit["metre", 1]],
    …scope, area and remarks omitted for brevity…
  BBox[40.04, -141.01, 86.46, -47.74],
  Id["EPSG", 3977, "9.1"]]

Complete WKT 2 output can be obtained on the command-line as below (require Apache SIS 0.7 or later):

sis transform --sourceCRS EPSG:4269 --targetCRS EPSG:3978 --verbose

For an operation backed by Proj.4, we get the description shown below. Since GDAL/Proj.4 does not support WKT 2 yet, Apache SIS does part of the work by formatting the CoordinateOperation, SourceCRS and TargetCRS elements itself using the information provided by Proj.4 through the GeoAPI interfaces. But for the Parameter elements, SIS simply lists the Proj.4 parameters verbatim without translating them to EPSG parameters (that translation can be seen in the SourceCRS and TargetCRS elements).

CoordinateOperation["From NAD83 to NAD83 +proj=lcc",
  SourceCRS[…definition omitted for brevity…],
  TargetCRS[…definition omitted for brevity…],
  Method["pj_transform"],
    Parameter["srcdefn", "+init=EPSG:4269 +proj=longlat +datum=NAD83 +no_defs +ellps=GRS80 +towgs84=0,0,0"],
    Parameter["dstdefn", "+init=EPSG:3978 +proj=lcc +lat_1=49 +lat_2=77 +lat_0=49 +lon_0=-95 +x_0=0 +y_0=0 +datum=NAD83 +units=m +no_defs +ellps=GRS80 +towgs84=0,0,0"]]

Complete WKT 2 output can be obtained on the command-line as below (require Apache SIS 0.8-SNAPSHOT or later):

sis transform --sourceCRS Proj4::+init=EPSG:4269 --targetCRS Proj4::+init=EPSG:3978 --verbose

Equivalence between the Proj.4 and EPSG parameter names is given below for the lccLambert Conic Conformal (2SP) projection method. Note that EPSG parameter names vary with projection methods; a more complete mapping is given here.

PROJ.4 ESPG NAME
lon_0
Longitude of false origin
lat_o
Latitude of false origin
lat_1
Latitude of 1st standard parallel
lat_2
Latitude of 2nd standard parallel
x_0
Easting at false origin
y_0
Northing at false origin
Name mapping for Lambert Conic Conformal (2SP) projection

Using those WKT representations and equivalence tables, we verified that all coordinate conversions (map projections) tested here use the same parameter values. Coordinate transformations however sometime use different values. This will be discussed in the next section.

Since forward projections are assumed exact (ignoring rounding errors) in this blog, errors are assumed to occur mostly in inverse projections and in the choice of Bursa-Wolf parameters. Those errors are evaluated in two different ways:

  • Errors in inverse projections: those errors are evaluated by applying forward projections (assumed exact) followed by inverse projections and measuring the distance with original coordinate points.

  • Errors caused by datum shift parameters choice: those errors are identified by comparing the Bursa-Wolf parameter values selected by tested libraries against the parameter values given by the EPSG registry. The library in agreement with EPSG is taken as the reference implementation used for estimating the error of the other library.

A related test is to apply 100 “forward operation – inverse operation” round trips and verifying that all final coordinates are less than 1 metre away from the original coordinates. This is not necessarily equivalent to verifying that coordinates are less than 1 centimetre away after a single “forward projection – inverse projection” round trip because the drift does not always increase linearly with the number of round trips. Furthermore we require the drift to be less than 1 meter regardless the operation accuracy (it can even be many meters) because the drift test measures internal consistency (or numerical stability), not the absolute operation accuracy.

The source code used for this test and for benchmarking is available in the Apache SIS project as part of a Maven project.

Results and discussion

For all coordinate operations, we measured how the drift increases with the number of round-trips. Our data show only two behaviors: either the drift increases linearly (as we would intuitively expect), or either the drift increases asymptomatically, sometime with oscillations around a value. An example is given below for the Albers Equal Area projection:

Some projections (e.g. Pseudo-Mercator) takes more time to reach a plateau, but in every tested cases the plateau (when it exists) is reached before 50 round trips. Consequently the GIGS recommendation to tests 100 round trips seems more than sufficient; we found no need to test 1000 round trips for instance. In every tested cases, the drift after 100 round trips stay below one meter. Numbers are below:

OPERATION SIS DRIFT (MM) PROJ.4 DRIFT (MM) SIS / PROJ.4 TREND
Cylindrical Equal Area (Spherical)
< 1
< 1
0,71
Asymptotic
Cylindrical Equal Area (Ellipsoidal)
< 1
64 ± 55
0,00
Linear
Pseudo Mercator
< 1
< 1
0,59
Asymptotic
Mercator
< 1
< 1
14,64
Linear
Lambert Conic Conformal
< 1
< 1
3,53
Linear
Polar Stereographic
< 1
< 1
15,85
Linear
Albers Equal Area
< 1
< 1
0,37
Asymptotic
Mercator 41°S → World Mercator
< 1
< 1
0,00
Linear
Geocentric translations
418 ± 290
418 ± 290
1,00
Linear
Position Vector transformation
9 ± 6
14 ± 6
0,60
Linear
Position Vector transformation
255 ± 7
264 ± 8
0.97
Linear
Drift after 100 round trips

Apache SIS shows a smaller drift than Proj.4 (SIS ∕ Proj.4 < 1) for map projections implemented by exact formulas (e.g. projections on a sphere). This may be explained by stricter trigonometric functions in Java compared to C/C++ (at the expense of performance), or by the more aggressive decomposition of map projections in linear and non-linear components done by Apache SIS (this decomposition reduces a little bit the rounding errors). Proj.4 shows a smaller drift (SIS ∕ Proj.4 > 1) for map projections implemented by iterative algorithms, maybe because Proj.4 uses smaller tolerance threshold than Apache SIS. But in all cases, the drift is well below EPSG requirement.

Comparison of transformation results

For map projections (when no datum shift is involved), Proj.4 and Apache SIS are in agreement within a nanometre. For coordinate transformations (i.e. operations involving a datum shift), Proj.4 and Apache SIS are also in agreement within a nanometre if one of the source or target datum is WGS84, but can disagree by 1 or 2 metres if neither the source and target datum is WGS84. This difference is explained by the architectural choice of the two libraries: Proj.4 is an early-bindingimplementation using WGS84 as a pivot datum (except when using datum shift grid files like NADCON), while Apache SIS is a late-binding implementation with no pivot datum. Instead, Apache SIS fetches the datum-shift parameters from the EPSG geodetic dataset for any pair of Coordinate Reference System (CRS) with no requirement to go through the WGS84 datum. This may result in a choice of different Bursa-Wolf parameter values. For example for a transformation from EPSG:4277 to EPSG:4230, Apache SIS gives us:

CoordinateOperation["OSGB 1936 to ED50 (UKOOA)",
  SourceCRS[…definition omitted for brevity…, Id["EPSG", 4277, "9.1"]]],
  TargetCRS[…definition omitted for brevity…, Id["EPSG", 4230, "9.1"]]],
  Method["Position Vector transformation (geog2D domain)"],
    Parameter["X-axis translation", 535.948, Unit["metre", 1]],
    Parameter["Y-axis translation", -31.357, Unit["metre", 1]],
    Parameter["Z-axis translation", 665.16, Unit["metre", 1]],
    Parameter["X-axis rotation", 0.15, Unit["arc-second", 4.84813681109536E-6]],
    Parameter["Y-axis rotation", 0.247, Unit["arc-second", 4.84813681109536E-6]],
    Parameter["Z-axis rotation", 0.998, Unit["arc-second", 4.84813681109536E-6]],
    Parameter["Scale difference", -21.689, Unit["parts per million", 1.0E-6]],
  OperationAccuracy[2.0],
    …scope, area and remarks omitted for brevity…
  BBox[49.79, -8.82, 60.94, 1.92],
  Id["EPSG", 1315, "9.1"]]

For Proj.4

CoordinateOperation["From OSGB36 to intl",
  SourceCRS[…definition omitted for brevity…],
  TargetCRS[…definition omitted for brevity…],
  Method["pj_transform"],
    Parameter["srcdefn", "+init=epsg:4277 +over +proj=longlat +datum=OSGB36 +no_defs +ellps=airy +towgs84=446.448,-125.157,542.060,0.1502,0.2470,0.8421,-20.4894"],
    Parameter["dstdefn", "+init=epsg:4230 +over +proj=longlat +ellps=intl +towgs84=-87,-98,-121,0,0,0,0 +no_defs"]]

Proj.4 does not tell us which Bursa-Wolf parameter values are effectively used; we assume that Proj.4 uses the parameters listed in srcdefn +towgs84 element, followed by the inverse of the parameters listed in dstdefn +towgs84 element. Those parameters are repeated in the two first rows of the table below. Since dstdefn parameters have no rotation and no scale difference, this sequence of operations is mathematically equivalent to performing a single operation using srcdefn parameters minus dstdefn parameters. This difference is given in the “Proj.4 presumed” row below. The parameters used by Apache SIS are the ones given in the EPSG geodetic dataset for coordinate operation EPSG:1315, and are given in the last row.

TX TY TZ RX RY RZ ΔS
Proj.4 srcdefn:
446.448
-125.157
542.060
0.1502
0.2470
0.8421
-20.4894
Proj.4 dstdefn:
-87
-98
-121
Proj.4 presumed:
533.448
-27,157
663,060
0,1502
0,2470
0,8421
-20,4894
Apache SIS:
535,948
-31,357
665,160
0,1500
0,2470
0,9980
-21,6890
Bursa-Wolf parameters for OSGB 1936 to ED50 transformation

We can see that the parameters obtained by early-binding approach are – in this case – close but not identical to the parameters obtained by late-binding approach. The average difference between Proj.4 and Apache SIS for points in the EPSG:1315 domain of validity (given by BBoxelement in above WKT) is 2.5 metres with a standard deviation of 0.6 metre. This is greater than the expected accuracy for EPSG:1315 transformation, which is 2 metres. If this particular case, given that EPSG:1315 is defined in the EPSG geodetic registry as “OSGB36 to WGS84 (Petroleum)” (EPSG:1314) minus “ED50 to WGS84 (Common Offshore)” (EPSG:1311), one may argue that the choice performed by Proj.4 is as valid. However Proj.4 gives no information about the domain of validity and accuracy of its choice. The user may think that correct transformation has been chosen or centimetric accuracy is achieved when it is not the case.

Another example is the transformation from Martinique 1938 (EPSG:4625) to RGAF09 (EPSG:5489). In this case, different possible methods (Molodensky, Molodensky Abridged and Geocentric Translation) produces slightly different results, and Proj.4 does not tell us which method it uses. Empirical comparisons with Apache SIS suggest that Proj.4 uses the geocentric translations method. The parameter values selected by Proj.4 and Apache SIS are shown below. The Apache SIS parameter values are taken from EPSG:5491 operation:

TX TY TZ RX RY RZ ΔS
Proj.4 srcdefn:
186
482
151
Proj.4 dstdefn:
Proj.4 presumed:
186
482
151
Apache SIS:
127,744
547,069
118,359
-3,1116
4,9509
-0,8837
14,1012
Bursa-Wolf parameters for Martinique 1938 to RGAF09 transformation

The average difference between Proj.4 and Apache SIS for points in the EPSG:5491 domain of validity is 0.9 metre with a standard deviation of 0.4 metre. This is greater than the expected accuracy for this transformation, which is 0.1 metre. In this example and contrarily to the “OSGB 1936 to ED50 (UKOOA)” example, the parameter values published by EPSG were determined directly by the national mapping agency without using WGS84 as a proxy. Any indirect transformations arrived through concatenation through WGS84 can only put error into the output. Consequently in this particular example, we can said that Proj.4 has an error of about one meter. This error is not caused by a bug that could be easily fixed, but rather by architectural choice (early-binding approach).

Transformations in North America

Many coordinate operations may exist for the same pair of source and target CRS. For example the EPSG geodetic dataset contains about 85 operations from NAD27 to WGS84, each operation using a different set of parameter values for different state or geographic area. For making a choice, the Apache SIS CRS class provides a convenience method:

CoordinateOperation op = CRS.findOperation(sourceCRS, targetCRS, areaOfInterest);

The areaOfInterest option is set to null in this test, which default to the widest domain of validity. Note that in the case of NAD27 to WGS84 transformations, the “widest domain of validity” criterion causes Apache SIS to select a transformation for Canada, not for USA. In order to avoid unfitted transformations, specifying a non-null areaOfInterest is recommended. The Proj.4 API does not have such option; maybe Proj.4 uses NADCON grids instead than Bursa-Wolf parameters for better accuracy, but we currently have no easy way to verify what Proj.4 is doing because of lack of WKT support for CoordinateOperation objects (Apache SIS compensates this hole for Proj.4 ProjectedCRS, but still can’t guess the details of Proj.4 operations). In order to avoid ambiguity, “NAD27 to WGS84” transformations have not been compared in this tests. Their comparisons will need further work.

Map projections on planetary CRS

For map projections on Earth, both libraries have shown drift smaller than 1 millimetre after 100 round trips. But for map projections on more flattened planets, two behaviors were observed:

  • Map projections implemented by iterative algorithms keep a stable accuracy. Those algorithms compensate higher eccentricities with more iterations. However the calculation do not converge anymore if the eccentricity is too high.

  • Map projections implemented by hard-coded series expansions have their accuracy degraded, has shown by increasing drift as the planet eccentricity is increased. In general, there is no warning that the accuracy is degraded.

For some map projections, the two above-cited approaches are available and produce the same results on Earth. This is the case of Lambert Conic Conformal for instance, but not the case of Transverse Mercator. When both algorithms are equivalent, series expansions are often preferred because they are faster than iterative algorithms. But series expansions are also less robust to extra-terrestrial CRS. In this blog post, we verified the effect on a planet flattened like Jupiter but scaled to the size of Earth for easier distance comparisons. The Jupiter inverse flattening factor is about 15.414, compared to 298.26 for Earth. All tested map projections except Cylindrical Equal Area have keep a centimetric accuracy for a single round trip, either because they are implemented by iterative methods or because the number of terms in their series expansions is still sufficient for Jupiter flattening factor. But in Proj.4 "cea" (Cylindrical Equal Area) projection, we observed an average error of 90 metres per round-trip (9 kilometres after 100 round trips), while Apache SIS keep a centrimetric accuracy. Both Proj.4 and Apache SIS uses the same series expansion with the same number of terms, but Apache SIS has dual implementation for that projection: if, after completion of series expansion, the accuracy is still insufficient, then Apache SIS continues the calculation with iterative algorithm. Apache SIS implements the same dual approach for Lambert Conic Conformal and a few other projections.

Conclusion and perspective

Apache SIS and Proj.4 are in agreement for all coordinate conversions (e.g. map projections) tested in this blog on a planet flattened like Earth. But they sometime disagree by 2 meters for coordinate transformations (e.g. datum shifts) on Earth and 90 meters for map projections on a Earth-sized planet flattened like Jupiter. In two of the three disagreements, the library providing the correct answer is Apache SIS. In the third disagreement, we may argue that both libraries are not wrong provided that distances between Proj.4 results and Apache SIS results are smaller than the transformation accuracy. But Proj.4 gives no indication about the accuracy of its operations.

The degree of confidence that we can put in a coordinate operation library depends on many factors, some at Proj.4 advantage and others at Apache SIS advantage:

Proj.4 advantage:

  • Long history and large users base, which in theory increases the chances that bugs have been identified and fixed. However Proj.4 included GIGS in its test suite only recently, and as of September 2017 is known to produce different results for some GIGS tests (this issue has also been observed years ago with Proj.4 wrappers in GeoAPI). Since Proj.4 is used by MapServer, PostGIS, QGIS and since Proj4J and Proj4JS are derived from Proj.4, any potential error in the library may not be immediately visible when using only above-cited software if they all reproduce the same errors.

Apache SIS advantages:

  • Run subset of GIGS tests since 2011 (indirectly). Apache SIS either passes those tests or have identified corrections that need to be applied to GIGS tests. However the tests implemented in GeoAPI and Apache SIS are incomplete.

  • Standard (ISO 19162) and more detailed description of coordinate operations. This capability allows users to verify what the library is actually doing (e.g. to see the effective Bursa-Wolf parameters). This information is often necessary when investigating why data do not appear where they should be.

  • Late-binding implementation, which enables better compliance with authoritative definitions (EPSG) for choosing Bursa-Wolf parameters. For some pairs of CRS, the use of WGS84 as a pivotal system in early-binding implementations can only introduce errors (e.g. 1 metre in Martinique).

  • Domain of validity information. This reduce the risk that users apply the wrong transformation for a given pair of CRS (for example using in Canada the NAD27 to WGS84 transformations designed for USA).

  • Expected accuracy information. This avoid luring the users in thinking that a coordinate operation is accurate up to rounding errors while actually it may have a stochastic 20 metres error.

  • More robustness to high eccentricities. The is beneficial mostly to planetary CRS, but also improves Cylindrical Equal Area inverse projection accuracy as a side-effect.

At the time of writing this blog, work is underway for improving Proj.4 with a new API. But I do not know which of above issues would be addressed. Changing the current Proj.4 early-binding approach into a late-binding one would be a significant architectural change. Whether such improvement is needed or not depends on the desired accuracy. Unfortunately, there is nothing in current implementation telling to user if Proj.4 is delivering the accuracy that (s)he need. Ideally, this lack of critical information could be fixed by implementing Well Known Text (WKT) version 2 (ISO 19162) in Proj.4. The following points describe a possible path:

  • Move the WKT parsing and formatting code from GDAL to Proj.4.
  • Provide WKT 2 support for projPJ data structure:
    • Define some WKT 2 keywords as aliases of WKT 1 keywords (e.g. “PROJCRS” as alias of “PROJCS”).
    • When a WKT 2 keyword is detected, raise a flag changing the way GDAL/Pro.4 applies units of measurement. While the WKT 2 standard defined by ISO 19162 is designed for backward compatibility with WKT 1 as defined in OGC 01-009 standard, the way that GDAL/Proj.4 currently handles units of measurement other than degrees and meters is not conform to that WKT 1 standard. This is unlikely to change for GDAL compatibility reasons, but this error shall not be propagated to WKT 2 parser. This would require GDAL/Proj.4 to implement a WKT parser working in two different modes depending on which keywords is found in the WKT definition.
    • Add support for mandatory WKT 2 elements, in particular AXIS.
  • Provide WKT 2 support for pj_transform(projPJ, projPJ, …) data structure (needs new Proj.4 API):
    • Include the Bursa-Wolf parameters (or any other parameters) effectively used.
    • Include operation domain of validity and expected accuracy. Those metadata are better described with late-binding implementations, but an early-binding implementation can still give some idea for example by computing the intersection of the domain of validity of source and target CRS.

Since Apache SIS is already there (and beyond), it could be used for inter-operability experiments and validation during Proj.4 developments. The Java Native Interface (JNI) bindings in the sis-gdal module should make this process easy.