Using Airbus OneAtlas’s Imagery for Change Detection in the Permian Basin


Introduction

As oil and gas production continues to ramp up in the Texas Permian Basin, advances in the processing of satellite imagery can be leveraged to identify when and where oil and gas production facilities are changing. Additional details about the activity at the identified locations can also be determined by analyzing the images. Change detection techniques for satellite imagery can provide insight into the capacity, functional state, and impact of oil and gas production projects, and can serve as inputs to predictive models for more accurate results.

Objective

Develop a basic methodology to automatically determine where oil extraction infrastructure has been altered, using SPOT imagery from Airbus’s new OneAtlas imagery offering.

OneAtlas – SPOT Imagery

To develop some basic unsupervised change detection models, Bytelion made use of the OneAtlas project from Airbus. As part of their OneAtlas Sandbox initiative, Airbus provided both SPOT and Pleiades imagery sets for two coverage areas: the Permian Basin, TX, and Amenas, Algeria. For this initiative, we focused on the SPOT imagery. The following SPOT image sets were provided:

  • SPOT6_201310201705565.zip (Permian Basin – Odessa, TX – 2013)
  • SPOT7_201507261703160.zip (Permian Basin – Odessa, TX – 2015)
  • SPOT7_201508270943305.zip (Amenas, Algeria – 2015)
  • SPOT7_201603100937274.zip (Amenas, Algeria – 2016)

Each zip archive contains multiple image files. Because the data is high resolution and rich, the archives we used ranged from 10 to 15 GB in size.

 

Unpacking the SPOT Imagery

To begin developing change detection capabilities, we first needed to unzip the imagery packages and load the images into a GIS application. All images within each archive are orthorectified and include three spectral bands (red, green, blue) at 8-bit resolution.

Each SPOT imagery dataset consists of a nested directory structure containing spectral masks, metadata files, processing descriptions, image files, and projection files at multiple levels. The primary image file is in JPEG2000 format, located at the IMG_<Product_ID> level.
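As a sketch of how the primary image could be located programmatically, the following Python filters an archive listing for .JP2 files under an IMG_ directory (the example path in the usage comment is illustrative, not an actual SPOT product layout):

```python
import zipfile

def find_primary_images(names):
    """Return archive paths that look like primary JPEG2000 image
    files, i.e. .JP2 files inside an IMG_... directory."""
    return [n for n in names
            if n.upper().endswith(".JP2")
            and any(part.startswith("IMG_") for part in n.split("/"))]

# Usage against a real archive (filename from the list above):
# with zipfile.ZipFile("SPOT6_201310201705565.zip") as z:
#     print(find_primary_images(z.namelist()))
```

This avoids extracting a 10–15 GB archive just to find the image of interest.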

 

Unsupervised Change Detection Methodology

QGIS Desktop, an open-source GIS software package, was used to perform a change detection analysis on the SPOT imagery for the Permian Basin area. The two SPOT images were brought into a QGIS project for a visual comparison. By overlaying the later imagery on top of the earlier imagery, a visual scan identified areas of change that could later be used to validate the automated methodology.

The above image depicts an area where the imagery has changed between 2013 and 2015.

As the extent of the SPOT images is extremely large, only a subset of the imagery was needed for the initial proof-of-concept analysis. Using the QGIS “Clip raster by extent” tool, a smaller image was clipped from each of the source image files. The clipped files were saved as GeoTIFFs and added to the QGIS project.
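Under the hood, clipping by extent amounts to converting the requested geographic extent into a pixel window using the raster’s geotransform. A minimal sketch of that conversion, assuming a GDAL-style north-up geotransform (the coordinate values used here are illustrative):

```python
def extent_to_pixel_window(gt, xmin, ymin, xmax, ymax):
    """Convert a geographic extent to a (col, row, width, height)
    pixel window using a GDAL-style geotransform:
    (origin_x, pixel_w, 0, origin_y, 0, -pixel_h).
    Assumes a north-up image, so the pixel height is negative."""
    origin_x, pixel_w, _, origin_y, _, pixel_h = gt
    col = int((xmin - origin_x) / pixel_w)
    row = int((ymax - origin_y) / pixel_h)  # ymax maps to the top row
    width = int(round((xmax - xmin) / pixel_w))
    height = int(round((ymin - ymax) / pixel_h))
    return col, row, width, height
```

The clip tool then reads only that window from the source raster and writes it to the new file.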

Raster arithmetic

Within QGIS, the Raster Calculator can be used to perform a mathematical comparison between the two subset images. The following Python code demonstrates the process:

1 – Import Libraries
from qgis.core import QgsRasterLayer
from qgis.analysis import QgsRasterCalculator, QgsRasterCalculatorEntry

2 – Set pathnames for the raster images
early = "D:/OneAtlas/qgis1/clip2013.tif"
later = "D:/OneAtlas/qgis1/clip2015.tif"
earlyName = "Early"
laterName = "Later"

3 – Load the images as raster layers
earlyRaster = QgsRasterLayer(early, earlyName)
laterRaster = QgsRasterLayer(later, laterName)

4 – Build the Raster Calculator entries (band 1 of each image)
earlyEntry = QgsRasterCalculatorEntry()
laterEntry = QgsRasterCalculatorEntry()
earlyEntry.raster = earlyRaster
laterEntry.raster = laterRaster
earlyEntry.bandNumber = 1
laterEntry.bandNumber = 1
earlyEntry.ref = earlyName + "@1"
laterEntry.ref = laterName + "@1"
entries = [laterEntry, earlyEntry]

5 – Generate the expression for the Raster Calculator
exp = "%s - %s" % (laterEntry.ref, earlyEntry.ref)

6 – Set up the output file properties as a GeoTIFF
output = "D:/OneAtlas/change_spot.tif"
rasterExtent = earlyRaster.extent()
rasterWidth = earlyRaster.width()
rasterHeight = earlyRaster.height()

7 – Perform the calculation in the Raster Calculator
changeAreas = QgsRasterCalculator(exp, output, "GTiff", rasterExtent, rasterWidth, rasterHeight, entries)
changeAreas.processCalculation()

Once the processing is complete, the output GeoTIFF image is added to the project. Rendering the output image as a singleband pseudocolor layer with an equal-interval classification scheme (in this case, five intervals reflecting the degree of change) yields a grayscale image in which larger spectral differences appear white, while areas of little or no change appear black.
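The equal-interval scheme itself is straightforward to express in code. A minimal sketch of binning a single difference value into five classes (in practice the value range would come from the change raster’s statistics):

```python
def equal_interval_class(value, vmin, vmax, n_classes=5):
    """Assign a value to one of n_classes equal-width intervals
    between vmin and vmax (0 = least change, n_classes - 1 = most)."""
    if value <= vmin:
        return 0
    if value >= vmax:
        return n_classes - 1
    width = (vmax - vmin) / n_classes
    return int((value - vmin) / width)
```

QGIS performs the same binning per pixel when the classification mode is set to equal interval.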

Binary representation of the output change image. White pixels represent areas of more extreme change.

Greyscale rendering of the output change image.

The output change raster image (center) identifies an area of change between the 2013 SPOT image (left) and the 2015 SPOT image (right)

Summary

By calculating the spectral differences between the SPOT images provided by Airbus OneAtlas, a basic change detection raster is generated that can be used to identify areas of interest where landscape features have changed. From the output raster, the pixels that depict more extreme change mark areas of interest that can be researched further to determine the extent and scope of the landscape change.

What is next? 

In the next phase, we plan to explore the identified areas to determine what changed between the images and whether any patterns or signatures can be automatically recognized and incorporated into machine learning models. Alternatively, we are considering how to scale the approach to entire oil fields rather than this single test area. We would love to hear any other ideas you have. Please let us know!


    GIS Crawling Done Easy Using Python With PostgreSQL


    Problem

    Company “Help-i-copter” lets users rent a ride in a helicopter for a lift in style. However, Help-i-copter is having trouble creating software that finds the best flat surface for helicopter pilots to land on when picking up their customers. A good solution would be to crawl data from websites and store it in a PostgreSQL database. What steps should they take?

    Crawling

    Crawling, in software terms, is the act of copying data from a website using a computer program. The data can be saved to a file, printed to the screen, or put into a database, depending on the project and its scale. At larger scales, where the data can be overwhelming, a database is the best option. For example, you could keep your grocery list online, but rather than logging in every time to see it, you could crawl that information off the web with a command like:

    soup.find_all('a', attrs={'class': 'list'})

    • Write a crawling program that collects data from multiple sources to get the most accurate results. Such a program would most likely be written in Python, which has access to modules like Requests and BeautifulSoup.

    import requests
    from bs4 import BeautifulSoup

    url = "http://www.placeforyourhelicopter.com"
    r = requests.get(url)
    soup = BeautifulSoup(r.content, "html.parser")
    surfaces = soup.find_all('a', attrs={'class': 'surface'})
    for d in surfaces:
        # collect the surface information from each link
        print(d.get('href'), d.get_text())

    • Store the collected data in a PostgreSQL database. This keeps the data organized, so it can be queried quickly and efficiently.

    import psycopg2

    endpoint = "surfaces.amazonaws.com"
    c = psycopg2.connect(host=endpoint, database="surfaces", user="admin", password="admin")
    cursor = c.cursor()
    query = ("INSERT INTO surfaces (airtraffic,windspeed,date_produced,lat,lon,state) "
             "VALUES (%s,%s,%s,%s,%s,%s);")
    data = ("low", 12.5, "2018-06-01", 31.84, -102.36, "TX")  # example surface record
    cursor.execute(query, data)
    c.commit()

    • Create an app that lets pilots enter a variable they want to query, along with parameters for that variable, then use Python and the psycopg2 module to query the data accordingly. This app will allow non-programmers to access the database without having to learn SQL.

    level = input("Choose an air-traffic level: ")
    cursor.execute("SELECT * FROM surfaces WHERE airtraffic LIKE %s", (level + "%",))
    for row in cursor:
        print(row)

     

    Databases

    So why is it important to store large amounts of data in a database? Simply put, it gives you an easy way to find and query data. For example, say you have a list of employees crawled from your company’s website and added to a PostgreSQL database. Finding “all employees whose first name begins with an L” becomes simple, because the data is well organized. A simple clause like:

    where employee_name LIKE 'L%'

    would return “Larry, Liz, and Lucy” quickly and efficiently.
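The same pattern can be tried locally with Python’s built-in sqlite3 module, which accepts the same LIKE syntax (the names and table are made up for illustration):

```python
import sqlite3

# Build a throwaway in-memory table of employees
conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("CREATE TABLE employees (employee_name TEXT)")
cur.executemany("INSERT INTO employees VALUES (?)",
                [("Larry",), ("Bob",), ("Liz",), ("Lucy",), ("Mark",)])

# Same LIKE pattern as the PostgreSQL clause above
cur.execute("SELECT employee_name FROM employees "
            "WHERE employee_name LIKE 'L%'")
results = [row[0] for row in cur.fetchall()]
print(results)  # ['Larry', 'Liz', 'Lucy']
```

One caveat: sqlite3’s LIKE is case-insensitive for ASCII by default, whereas PostgreSQL’s LIKE is case-sensitive (use ILIKE there for case-insensitive matching).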

    Bounding Box

     

    This data could be used in many ways. For example, we could use our latitude and longitude coordinates to create a bounding box and get information about the area within it. Pilots could use a bounding box to find flat areas near them and sort those by air traffic, location, and so on. The app would ask the pilot for the minimum and maximum coordinates of the box and then look through the database for any surface that fits the description. Here’s what that might look like in Python:

    # In PostGIS, x is longitude and y is latitude
    xmin = input("What is the minimum longitude? ")
    xmax = input("What is the maximum longitude? ")
    ymin = input("What is the minimum latitude? ")
    ymax = input("What is the maximum latitude? ")
    cursor.execute(
        "SELECT * FROM surfaces"
        " WHERE ST_SetSRID(ST_MakePoint(lon, lat), 4326)"
        " && ST_MakeEnvelope(%s, %s, %s, %s, 4326)",
        (xmin, ymin, xmax, ymax))
    for row in cursor:
        print(row)

    Conclusion

     

    As you can see, crawling and databases work very well together, especially when crawling large amounts of data. It would be inefficient and much slower to store the data in a plain document, or to re-crawl the page every time the program runs. Thanks to the power of PostgreSQL and Python, Help-i-copter is able to efficiently crawl a site, store the data, and quickly return query results to its pilots.

    Enabling PostGIS on PostgreSQL with Amazon Web Services (AWS) | Relational Database Service (RDS)


    Part 1 of our PostGIS series helps developers enable PostGIS for a PostgreSQL database in AWS.

    Amazon’s Relational Database Service (RDS) allows users to launch PostgreSQL databases quickly and easily. PostgreSQL lets developers extend the functionality of the core database via extensions, which can function just like built-in features.

    Because these databases run on Amazon’s servers, Amazon limits which extensions it will allow to be hosted on the service. These limitations are mostly for performance reasons.

    PostGIS Version

    By default, the database already has PostGIS installed. Checking the installed version isn’t critical, but it may help in your troubleshooting efforts.

    To get the current version, use this:

    SELECT PostGIS_full_version();

    Result:

    POSTGIS="2.3.2 r15302" GEOS="3.5.1-CAPI-1.9.1 r4246" PROJ="Rel. 4.9.3, 15 August 2016" GDAL="GDAL 2.1.3, released 2017/20/01" LIBXML="2.9.1" LIBJSON="0.12" RASTER

    Normal Installation Guidance

    If you follow PostGIS’s guidance here, you will get the following error.

    When you run this command:
    CREATE EXTENSION postgis_sfcgal;

    ERROR: Extension “postgis_sfcgal” is not supported by Amazon RDS
    DETAIL: Installing the extension “postgis_sfcgal” failed, because it is not on the list of extensions supported by Amazon RDS.
    HINT: Amazon RDS allows users with rds_superuser role to install supported extensions. See: SHOW rds.extensions;

    AWS RDS PostGIS Installation Guidance

    AWS covers enabling PostGIS via their common database administrator (DBA) tasks here.

    In about three minutes, you can run through these commands and enable PostGIS tables in your database.
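In outline, those commands come down to connecting as the master user and creating the supported PostGIS extensions. A rough sketch in Python (the host and credentials in the usage comment are placeholders; the AWS page also covers follow-up steps such as transferring ownership of the tiger and topology schemas):

```python
# CREATE EXTENSION statements the AWS guidance walks through
STATEMENTS = [
    "CREATE EXTENSION postgis;",
    "CREATE EXTENSION fuzzystrmatch;",
    "CREATE EXTENSION postgis_tiger_geocoder;",
    "CREATE EXTENSION postgis_topology;",
]

def enable_postgis(conn):
    """Run each statement on an open DB-API connection made as the
    RDS master user (for example, a psycopg2 connection)."""
    with conn.cursor() as cur:
        for stmt in STATEMENTS:
            cur.execute(stmt)
    conn.commit()

# import psycopg2
# conn = psycopg2.connect(host="<your-instance>.rds.amazonaws.com",
#                         dbname="mydb", user="master", password="...")
# enable_postgis(conn)
```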

    AWS Supported Extensions

    This is a pretty nifty command to see which extensions are supported on the instance.

    SHOW rds.extensions;

    Here is a list of the PostgreSQL extensions that RDS does support:

    address_standardizer
    address_standardizer_data_us
    bloom
    btree_gin
    btree_gist
    chkpass
    citext
    cube
    dblink
    dict_int
    dict_xsyn
    earthdistance
    fuzzystrmatch
    hll
    hstore
    hstore_plperl
    intagg
    intarray
    ip4r
    isn
    log_fdw
    ltree
    pgaudit
    pgcrypto
    pgrouting
    pgrowlocks
    pgstattuple
    pg_buffercache
    pg_freespacemap
    pg_hint_plan
    pg_prewarm
    pg_repack
    pg_stat_statements
    pg_trgm
    pg_visibility
    plcoffee
    plls
    plperl
    plpgsql
    pltcl
    plv8
    postgis
    postgis_tiger_geocoder
    postgis_topology
    postgres_fdw
    sslinfo
    tablefunc
    test_parser
    tsearch2
    tsm_system_rows
    tsm_system_time
    unaccent
    uuid-ossp