Load gather from S3
Load gather from S3 reads a seismic gather dataset previously saved to Amazon S3 cloud storage and brings it into the g-Platform processing flow. The module first reads a metadata file stored alongside the data to determine the gather dimensions (trace count, sample count, sample interval), then downloads all gather tiles in parallel using multiple read threads. The gather is assembled in memory from the downloaded tiles and passed to downstream modules as a standard gather output.
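The metadata-then-tiles sequence can be sketched as follows. This is a minimal illustration, not the platform's actual implementation: the fetch callable, the metadata.json file name, and the JSON tile list are assumptions made for the example.

```python
import json
from concurrent.futures import ThreadPoolExecutor

def load_gather(fetch, prefix, read_threads=5):
    """Sketch of the load sequence: read the metadata file first,
    then fetch all tiles in parallel and assemble them in order.

    `fetch(key) -> bytes` stands in for an S3 GET request; the
    metadata layout (a JSON file listing tile keys) is illustrative.
    """
    meta = json.loads(fetch(prefix + "/metadata.json"))
    tile_keys = meta["tiles"]  # tile order defines assembly order
    with ThreadPoolExecutor(max_workers=read_threads) as pool:
        # pool.map returns results in key order, so assembly is
        # deterministic even though downloads finish out of order
        tiles = list(pool.map(fetch, tile_keys))
    return b"".join(tiles), meta
```

Because `pool.map` preserves input order, the assembled gather is identical regardless of the order in which individual tile downloads complete.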
This module is the counterpart to Save gather to S3 and is designed for cloud-based processing workflows where large seismic datasets are stored in S3 buckets. Authentication is managed through named credential profiles defined in the local credentials configuration file.
This module does not require a seismic data input connection. All data is read directly from Amazon S3 using the path and credentials specified in the parameters.
The S3 path (key prefix) identifying the gather dataset to load. This should match the gather name used when the data was written by the Save gather to S3 module. The path must point to the root location of the gather in the S3 bucket, where the metadata file and data tiles are stored.
The name of the AWS credentials profile to use for authenticating with Amazon S3. Profiles are defined in the local S3 credentials configuration file (INI format). The dropdown list is populated automatically from the available profiles in that file. Select the profile that provides access to the S3 bucket containing the gather data.
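Populating a profile list from an INI-format credentials file can be sketched with the standard library's `configparser`; each `[section]` corresponds to one named profile, following the AWS shared-credentials convention. The field names shown are illustrative.

```python
import configparser

def list_profiles(ini_text):
    """Return the profile names found in INI-format credentials text.

    A sketch of how a profile dropdown could be populated; in practice
    the text would be read from the local S3 credentials file.
    """
    cp = configparser.ConfigParser()
    cp.read_string(ini_text)
    return cp.sections()  # one section per named profile
```

Selecting a profile by name then amounts to reading the keys of the corresponding section.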
The number of parallel threads used to download gather tiles from S3. Default: 5. Increasing this value can significantly reduce loading time for large gathers with many tiles, provided the network connection and S3 service can sustain the concurrent requests. Valid range: 1 to 1000.
Selects whether processing runs on the CPU or GPU. For this module, which performs network I/O rather than heavy computation, CPU execution is standard.
Controls whether the module runs on a remote processing node in a distributed cluster environment. When enabled, the job is submitted to a remote node rather than executed locally.
The minimum number of gathers processed per execution chunk in distributed mode. Larger values reduce scheduling overhead but require more memory per node.
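The chunking arithmetic implied by this setting can be sketched as follows; this is an illustration of how a minimum chunk size bounds the number of chunks, not the platform's actual scheduler.

```python
def split_into_chunks(n_gathers, min_chunk):
    """Partition n_gathers into contiguous chunks of at least
    min_chunk gathers each (illustrative arithmetic only).

    Any remainder is spread across the chunks so that none falls
    below the minimum; if the total itself is below the minimum,
    a single undersized chunk is unavoidable.
    """
    if n_gathers <= min_chunk:
        return [n_gathers]
    n_chunks = n_gathers // min_chunk        # guarantees each >= min_chunk
    base, extra = divmod(n_gathers, n_chunks)
    return [base + 1] * extra + [base] * (n_chunks - extra)
```

A larger `min_chunk` yields fewer, bigger chunks: less scheduling overhead, but more memory per node, as noted above.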
When distributed execution is active, this setting caps the number of threads that each remote node is allowed to use, preventing overloading of shared cluster resources.
An optional text label appended to the distributed job name. Use this to distinguish multiple simultaneous jobs running on the cluster.
When enabled, allows the user to manually specify which CPU cores or NUMA nodes this module is allowed to use. Leave disabled to let the system assign resources automatically.
The specific CPU core or NUMA node affinity mask, active only when Set custom affinity is enabled.
The number of CPU threads to use for local execution. Higher values can improve throughput when processing many gathers sequentially.
When enabled, this module is bypassed and execution continues with the next module in the flow. Use this setting to temporarily disable the module without removing it from the workflow.
The primary data container passed to the next module in the sequence, carrying all associated seismic data items loaded from S3.
A handle to the seismic data reader, enabling downstream modules to access trace data on demand.
The collection of trace headers describing all traces in the loaded gather, including geometry and sorting information.
The seismic gather loaded from S3, assembled from all downloaded tiles. This gather is available for connection to any downstream processing or display module.
The 2D stack line geometry associated with the loaded dataset, if present.
The crooked 2D line geometry associated with the loaded dataset, if present.
The 3D bin grid geometry associated with the loaded dataset, used for inline/crossline address resolution, if present.
An indexed and sorted version of the trace headers, used by downstream modules that require ordered access to traces by gather key (e.g., by CDP or offset).
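Building such an index can be sketched as a sort by gather key followed by grouping; the header field names `cdp` and `offset` are illustrative stand-ins, not the platform's actual header schema.

```python
def index_headers(headers, primary="cdp", secondary="offset"):
    """Build a sorted trace index keyed by gather key (a sketch).

    Returns a mapping from each primary-key value (e.g. CDP number)
    to the original trace indices of that gather, with traces
    ordered by the secondary key (e.g. offset) within each gather.
    """
    order = sorted(
        range(len(headers)),
        key=lambda i: (headers[i][primary], headers[i][secondary]),
    )
    index = {}
    for i in order:
        index.setdefault(headers[i][primary], []).append(i)
    return index
```

Downstream consumers can then fetch all traces of a given gather in sorted order without rescanning the full header table.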