Build Geolambda with GEE

Introduction

Geolambda is a Docker-based builder for AWS Lambda function packages. If we replace the GDAL installation in its Dockerfile with GEE, we can build a Lambda function that runs on AWS.

# GDAL
RUN \
    wget http://download.osgeo.org/gdal/$GDAL_VERSION/gdal-$GDAL_VERSION.tar.gz; \
    tar -xzvf gdal-$GDAL_VERSION.tar.gz; \
    cd gdal-$GDAL_VERSION;

The base of Geolambda is specified in Dockerfile.base:

FROM lambdalinux/baseimage-amzn:2017.03-004

Problem

docker-compose hangs during GEOS package compilation on macOS Mojave:

libtool: compile: g++ -DHAVE_CONFIG_H -I. -I../../../include -I../../../include/geos -I../../../include -DGEOS_INLINE -pedantic -Wall -ansi -Wno-long-long -ffloat-store -g -O2 -MT MonotoneChainSelectAction.lo -MD -MP -MF .deps/MonotoneChainSelectAction.Tpo -c MonotoneChainSelectAction.cpp -fPIC -DPIC -o .libs/MonotoneChainSelectAction.o

If that happens, restart the Docker app and run the build again. At the end, the GDAL build fails with an error that HDF5 is missing:

Checking for H5Fopen in -lhdf5... no
configure: error: HDF5 support requested with arg /usr/local, but no hdf5 lib found
GNUmakefile:1: GDALmake.opt: No such file or directory
./config.status --recheck
make: ./config.status: Command not found
make: *** [config.status] Error 127
GNUmakefile:1: GDALmake.opt: No such file or directory
./config.status --recheck
make: ./config.status: Command not found
make: *** [config.status] Error 127

The above error can be fixed by changing the HDF5 library version to 1.10.4 in the Dockerfile. After that change, lambda-package.sh fails because it still looks for libhdf5.so.101:

[root@befa35da0698 bin]# ./lambda-package.sh
./lambda-package.sh
Creating deploy package for Python 2.7
cp: cannot stat /usr/local/lib/libhdf5.so.101: No such file or directory

The Lambda package created by Geolambda works well. However, /vsicurl doesn't work for a CERES HDF file with gdal.Open():

Runtime failure is likely !
ERROR 4: `/vsicurl/https://gamma.hdfgroup.org/ftp/pub/outgoing/NASAHDF/CER_ES4_TRMM-PFM_Edition2_019018.19808.hdf' not recognized as a supported file format.

HDF5 Open Failure for GDAL version 2.3.1

GDAL 2.3.1 can't open the MOP03T HDF5 file. It seems that GDAL 2.4.0 is necessary for /vsicurl:

Error detected in HDF5 (1.10.4) thread 0:
  #000: H5F.c line 509 in H5Fopen(): unable to open file
    major: File accessibilty
    minor: Unable to open file
  #001: H5Fint.c line 1400 in H5F__open(): unable to open file
    major: File accessibilty
    minor: Unable to open file
  #002: H5Fint.c line 1546 in H5F_open(): unable to open file: time = Fri Jan 18 02:59:01 2019
, name = '/vsicurl/https://gamma.hdfgroup.org/ftp/pub/outgoing/NASAHDF/MOP03T-20131129-L3V5.2.1.he5', tent_flags = 0
    major: File accessibilty
    minor: Unable to open file
  #003: H5FD.c line 734 in H5FD_open(): open failed
    major: Virtual File Layer
    minor: Unable to initialize object
  #004: H5FDsec2.c line 346 in H5FD_sec2_open(): unable to open file: name = '/vsicurl/https://gamma.hdfgroup.org/ftp/pub/outgoing/NASAHDF/MOP03T-20131129-L3V5.2.1.he5'
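Since the outcome depends on the GDAL version, it may help to check the version inside the Lambda package before opening remote files. The following is a minimal sketch, assuming GDAL's usual VERSION_NUM encoding (major*1000000 + minor*10000 + revision*100 + build). The helper names parse_gdal_version_num and supports_hdf5_vsicurl are hypothetical, and the 2.4.0 threshold comes from the observation above, not from GDAL documentation.

```python
def parse_gdal_version_num(version_num):
    """Split a VERSION_NUM string such as '2030100' into (major, minor, revision)."""
    n = int(version_num)
    major = n // 1000000
    minor = (n // 10000) % 100
    revision = (n // 100) % 100
    return (major, minor, revision)


def supports_hdf5_vsicurl(version_num, minimum=(2, 4, 0)):
    """True if the version is at least the threshold observed in these notes (an assumption)."""
    return parse_gdal_version_num(version_num) >= minimum


if __name__ == "__main__":
    # In the Lambda package the string would come from
    # osgeo.gdal.VersionInfo("VERSION_NUM").
    print(parse_gdal_version_num("2030100"))  # the encoding of GDAL 2.3.1
    print(supports_hdf5_vsicurl("2040000"))   # the encoding of GDAL 2.4.0
```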

Identifying Band from HDF5

Warning 1: GDAL was built against curl 7.53.1, but is running against 7.51.0. Runtime failure is likely !
'NoneType' object has no attribute 'GetStatistics': AttributeError
Traceback (most recent call last):
  File "/var/task/lambda.py", line 26, in handler
    stats = band.GetStatistics(0, 1)
AttributeError: 'NoneType' object has no attribute 'GetStatistics'
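The traceback shows that the band object was None when GetStatistics() was called. A likely cause, stated here as an assumption, is that an HDF5 container with several subdatasets exposes no raster bands itself, so the handler has to open one subdataset by name. The sketch below turns the AttributeError into a descriptive error. band_statistics is a hypothetical helper, and FakeBand and FakeDataset are stand-ins so the example runs without GDAL.

```python
def band_statistics(dataset, band_index=1):
    """Return [min, max, mean, stddev] for one band, or raise a descriptive error."""
    if dataset is None:
        # gdal.Open() returns None when the file cannot be opened.
        raise ValueError("dataset could not be opened")
    band = dataset.GetRasterBand(band_index)
    if band is None:
        # List the subdataset names so the caller can open one of them instead.
        names = [name for name, _description in dataset.GetSubDatasets()]
        raise ValueError("no band %d; subdatasets available: %s" % (band_index, names))
    # Same call as in lambda.py: approx_ok=0, force=1.
    return band.GetStatistics(0, 1)


# Stand-ins so the sketch runs without GDAL; real objects come from gdal.Open().
class FakeBand(object):
    def GetStatistics(self, approx_ok, force):
        return [0.0, 1.0, 0.5, 0.25]


class FakeDataset(object):
    def __init__(self, band=None, subdatasets=()):
        self._band = band
        self._subdatasets = list(subdatasets)

    def GetRasterBand(self, index):
        return self._band

    def GetSubDatasets(self):
        return self._subdatasets


if __name__ == "__main__":
    print(band_statistics(FakeDataset(band=FakeBand())))
    container = FakeDataset(subdatasets=[('HDF5:"file.he5"://path/to/field', "description")])
    try:
        band_statistics(container)
    except ValueError as error:
        print(error)
```

In the real handler, the dataset argument would be the return value of gdal.Open().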

/vsis3 looks up s3.amazonaws.com

If HDF5:%22%2Fvsis3%2Ftest%2FMOP03T-20131129-L3V5.2.1.he5%22:%2F%2FHDFEOS%2FGRIDS%2FMOP03%2FData_Fields%2FRetrievedSurfaceTemperatureDay is submitted, /vsis3 looks up amazonaws.com, not LocalStack S3.

HTTP response code on https://test.s3.amazonaws.com/MOP03T-20131129-L3V5.2.1.he5: 403
HDF5-DIAG: Error detected in HDF5 (1.10.4) thread 0:
  #000: H5F.c line 509 in H5Fopen(): unable to open file
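To make the request easier to read, the percent-encoded name can be decoded and split into bucket, key, and the path inside the file. The sketch below does that and rebuilds the virtual-hosted URL seen in the log. The helper names are hypothetical, and the URL layout is inferred from the log line above, not taken from GDAL source code.

```python
try:
    from urllib.parse import unquote  # Python 3
except ImportError:
    from urllib import unquote  # Python 2.7, as used by the Lambda package


def split_vsis3_subdataset(encoded):
    """Decode an HDF5 subdataset name and return (decoded, bucket, key, internal_path)."""
    decoded = unquote(encoded)
    # Expected form: HDF5:"/vsis3/<bucket>/<key>":<path inside the file>
    _driver, rest = decoded.split(':"', 1)
    vsi_path, internal_path = rest.split('":', 1)
    prefix = "/vsis3/"
    if not vsi_path.startswith(prefix):
        raise ValueError("not a /vsis3 path: %s" % vsi_path)
    bucket, key = vsi_path[len(prefix):].split("/", 1)
    return decoded, bucket, key, internal_path


def virtual_hosted_url(bucket, key, endpoint="s3.amazonaws.com"):
    """Build the <bucket>.<endpoint> style URL that appears in the 403 log line."""
    return "https://%s.%s/%s" % (bucket, endpoint, key)


if __name__ == "__main__":
    name = ("HDF5:%22%2Fvsis3%2Ftest%2FMOP03T-20131129-L3V5.2.1.he5%22:"
            "%2F%2FHDFEOS%2FGRIDS%2FMOP03%2FData_Fields%2FRetrievedSurfaceTemperatureDay")
    decoded, bucket, key, internal_path = split_vsis3_subdataset(name)
    print(decoded)
    print(virtual_hosted_url(bucket, key))
```

Here the bucket is test, so with the default endpoint the request goes to test.s3.amazonaws.com, which matches the URL that returned 403.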

Setting AWS_S3_ENDPOINT corrects the above behavior, but GDAL always uses https:// for data retrieval, and https:// cannot be used with LocalStack by default. Define USE_SSL=true when you start LocalStack to enable it. However, SSL is not supported by mockit [5].

Python gdal.Open(opendap_url)

/vsimem is invoked on the downloaded file.

#002: H5Fint.c line 1546 in H5F_open(): unable to open file: time = Tue Feb 5 04:45:52 2019 , name = '/vsimem/http_1/grid_1_2d.h5', tent_flags = 0

This happens because the DODS driver is missing [6].
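A quick way to confirm which drivers a build lacks is to look each one up by its short name. The sketch below assumes the lookup function returns None for an unknown driver, as gdal.GetDriverByName does. missing_drivers is a hypothetical helper, and the fake lookup table stands in for a real GDAL build.

```python
def missing_drivers(get_driver_by_name, names=("DODS", "HDF5")):
    """Return the driver short names for which the lookup function returns None."""
    return [name for name in names if get_driver_by_name(name) is None]


if __name__ == "__main__":
    # Stand-in for a build that has HDF5 but no DODS driver.
    available = {"HDF5": object(), "GTiff": object()}
    print(missing_drivers(available.get))
```

Inside the Lambda package, the call would be missing_drivers(gdal.GetDriverByName) after importing gdal from osgeo.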

Background

docker-compose.yml specifies two services:

  • base
  • core

base doesn't include GDAL; core is our main target. Although Geolambda's README says docker-compose run base, we should run core:

$ docker-compose run core

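GDAL provides the AWS_S3_ENDPOINT option for pointing /vsis3 at a different S3 endpoint:

from osgeo import gdal

# Endpoint and credentials for the S3 service that /vsis3 should use.
gdal.SetConfigOption(b'AWS_S3_ENDPOINT', AWS_S3_ENDPOINT.encode())
gdal.SetConfigOption(b'AWS_ACCESS_KEY_ID', AWS_ACCESS_KEY_ID.encode())
gdal.SetConfigOption(b'AWS_SECRET_ACCESS_KEY', AWS_SECRET_ACCESS_KEY.encode())
# Log HTTP traffic and use path-style requests instead of <bucket>.<endpoint>.
gdal.SetConfigOption(b'CPL_CURL_VERBOSE', b'YES')
gdal.SetConfigOption(b'AWS_VIRTUAL_HOSTING', b'NO')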
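The option values have to come from somewhere, and in a Lambda function the environment is a natural place. The following sketch collects them from an environment-like mapping and applies them through an injected setter, so it runs without GDAL. The helper names and the localhost:4572 endpoint are assumptions for illustration, not values from this setup.

```python
import os


def s3_config_options(env, verbose=False):
    """Build the GDAL option dict for a custom S3 endpoint from a mapping such as os.environ."""
    options = {
        "AWS_S3_ENDPOINT": env["AWS_S3_ENDPOINT"],
        "AWS_ACCESS_KEY_ID": env["AWS_ACCESS_KEY_ID"],
        "AWS_SECRET_ACCESS_KEY": env["AWS_SECRET_ACCESS_KEY"],
        # Path-style requests, matching the AWS_VIRTUAL_HOSTING setting above.
        "AWS_VIRTUAL_HOSTING": "NO",
    }
    if verbose:
        options["CPL_CURL_VERBOSE"] = "YES"
    return options


def apply_options(options, set_option):
    """Call set_option(key, value) for every option, in a stable order."""
    for key in sorted(options):
        set_option(key, options[key])


if __name__ == "__main__":
    # Hypothetical values; a real handler would pass os.environ and gdal.SetConfigOption.
    env = {
        "AWS_S3_ENDPOINT": "localhost:4572",
        "AWS_ACCESS_KEY_ID": "test",
        "AWS_SECRET_ACCESS_KEY": "test",
    }
    applied = []
    apply_options(s3_config_options(env, verbose=True), lambda k, v: applied.append((k, v)))
    print(applied)
```

GDAL's /vsis3 documentation also lists an AWS_HTTPS option. Whether the GDAL version built here honors it was not tested in these notes.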

Use the following command to generate a session token:

$ aws sts get-session-token --duration-seconds 129600
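The command prints a JSON document with a Credentials object, and 129600 seconds is 36 hours. As a minimal sketch, the output can be mapped to the GDAL options for temporary credentials. The helper name is hypothetical, the sample values are placeholders, and the use of AWS_SESSION_TOKEN with /vsis3 is taken from GDAL's documentation, not from a test in these notes.

```python
import json


def gdal_options_from_sts(sts_json):
    """Map the JSON printed by `aws sts get-session-token` to GDAL config options."""
    credentials = json.loads(sts_json)["Credentials"]
    return {
        "AWS_ACCESS_KEY_ID": credentials["AccessKeyId"],
        "AWS_SECRET_ACCESS_KEY": credentials["SecretAccessKey"],
        "AWS_SESSION_TOKEN": credentials["SessionToken"],
    }


if __name__ == "__main__":
    # Placeholder values only; real output contains actual credentials.
    sample = json.dumps({
        "Credentials": {
            "AccessKeyId": "EXAMPLE_KEY_ID",
            "SecretAccessKey": "EXAMPLE_SECRET",
            "SessionToken": "EXAMPLE_TOKEN",
            "Expiration": "2019-01-19T14:59:01Z",
        }
    })
    for key, value in sorted(gdal_options_from_sts(sample).items()):
        print(key, value)
```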

Solution

  • Modify Dockerfile: set the HDF5 version to 1.10.4 and the GDAL version to 2.4.0.
  • Modify lambda-package-base.sh: change libhdf5.so.101 to libhdf5.so.103.
  • docker-compose build (--no-cache)
  • docker-compose run core
  • Copy lambda-deploy.zip to the host system with sftp (e.g. # sftp hyoklee@nene) to save the Lambda function package.
  • Unzip it, add your lambda.py, and zip it again.
  • Deploy to LocalStack.
  • Check which Lambda functions were executed: $ docker ps -a
  • Check output logs: $ docker logs [container_id]
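The repackaging step above can also be done without unpacking the archive. The sketch below appends the handler to the existing zip in place, which is a substitute for the unzip and zip round trip, not the procedure used in these notes. add_handler_to_package is a hypothetical helper, and the demo builds a throwaway archive in a temporary directory.

```python
import os
import tempfile
import zipfile


def add_handler_to_package(package_path, handler_path, arcname="lambda.py"):
    """Append the handler file to an existing deployment zip and return its file list."""
    with zipfile.ZipFile(package_path, "a", zipfile.ZIP_DEFLATED) as package:
        if arcname in package.namelist():
            raise ValueError("%s is already in %s" % (arcname, package_path))
        package.write(handler_path, arcname)
    with zipfile.ZipFile(package_path) as package:
        return package.namelist()


if __name__ == "__main__":
    workdir = tempfile.mkdtemp()
    package_path = os.path.join(workdir, "lambda-deploy.zip")
    # Throwaway archive standing in for the package built by Geolambda.
    with zipfile.ZipFile(package_path, "w") as package:
        package.writestr("lib/placeholder.txt", "stand-in for the bundled libraries")
    handler_path = os.path.join(workdir, "lambda.py")
    with open(handler_path, "w") as handler_file:
        handler_file.write("def handler(event, context):\n    return {}\n")
    print(add_handler_to_package(package_path, handler_path))
```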
Experiment

Geolambda's GDAL image ran successfully on LocalStack for a GEE sample GeoTIFF file, https://gamma.hdfgroup.org/ftp/pub/outgoing/joe/gee/MOP03T.45.tif. /vsicurl worked fine with gdal.Open().

Warning 1: GDAL was built against curl 7.53.1, but is running against 7.51.0. Runtime failure is likely !
[-9999.0, 444.1781311035156, -9000.95712966157, 3016.201309916306]
[DEBUG] 2019-01-17T03:38:47.852Z 228987c5-4707-4171-b0c5-da5e55b6f4fd [-9999.0, 444.1781311035156, -9000.95712966157, 3016.201309916306]
END RequestId: 228987c5-4707-4171-b0c5-da5e55b6f4fd
REPORT RequestId: 228987c5-4707-4171-b0c5-da5e55b6f4fd Duration: 529 ms Billed Duration: 600 ms Memory Size: 1536 MB Max Memory Used: 41 MB
{"body": [-9999.0, 444.1781311035156, -9000.95712966157, 3016.201309916306], "statusCode": 200}
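The four numbers in the response body are what GetStatistics() returns: minimum, maximum, mean, and standard deviation, in that order. A small sketch for labeling them when reading the Lambda response follows. label_statistics is a hypothetical helper. Reading the minimum of -9999.0 as a fill value is an assumption; if it holds, it would explain the strongly negative mean.

```python
import json

# Order of the values returned by GDAL's Band.GetStatistics().
STAT_NAMES = ("min", "max", "mean", "stddev")


def label_statistics(response_json):
    """Turn the Lambda response body into a dict keyed by statistic name."""
    response = json.loads(response_json)
    return dict(zip(STAT_NAMES, response["body"]))


if __name__ == "__main__":
    response = ('{"body": [-9999.0, 444.1781311035156, -9000.95712966157, '
                '3016.201309916306], "statusCode": 200}')
    labeled = label_statistics(response)
    for name in STAT_NAMES:
        print(name, labeled[name])
```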