4.2.3. Interpreting Query Job Information

This section explains how to interpret BigQuery query job information.

Google maintains documentation about BigQuery job information at the links below.

The figure below depicts how contiguous and overlapping job stage intervals can be grouped into islands, during which one or more slots are continuously allocated to the job's stages. The intervals between islands form gaps, during which no slots are allocated. Stages 0 and 1 are contiguous, and stage 2 overlaps with stage 1, so stages 0, 1, and 2 form one island. Stage 3 executes alone and forms its own island. Between stages island 1 and stages island 2 there is a gap. The motivation for determining the islands and gaps is to determine when one or more slots are allocated (good) and when no slots are allocated (bad). Note that terms used in this figure are described below.

    -- stage 0 startTime ----------------------------
    |                                              |
    |  stage 0 interval                            |  stages island 1
    |                                              |
    -- stage 0 endTime -------- stage 1 startTime  |
                             |                     |
    -- stage 2 startTime     |  stage 1 interval   |
    |                        |                     |
    |  stage 2 interval      -- stage 1 endTime    |
    |                                              |
    -- stage 2 endTime ------------------------------
    |
    |  stages gap 1
    |
    -- stage 3 startTime  --
    |                     |
    |  stage 3 interval   |  stages island 2
    |                     |
    -- stage 3 endTime------
    |
    | time
    V
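
The grouping shown in the figure above can be sketched in Python as a standard interval merge. This is a minimal sketch, not the method used by BigQuery or by any particular tool: the function name group_into_islands, the (start_ms, end_ms) tuple representation, and the sample values are all assumptions made for illustration.

```python
from typing import List, Tuple

Interval = Tuple[int, int]  # (start_ms, end_ms), hypothetical representation


def group_into_islands(stages: List[Interval]) -> List[Interval]:
    """Merge contiguous or overlapping stage intervals into islands."""
    islands: List[Interval] = []
    for start, end in sorted(stages):
        if islands and start <= islands[-1][1]:
            # The stage starts at or before the current island ends, so it is
            # contiguous or overlapping: extend the island if needed.
            islands[-1] = (islands[-1][0], max(islands[-1][1], end))
        else:
            # The stage starts after the current island ends: new island.
            islands.append((start, end))
    return islands


# Made-up values shaped like the figure: stages 0 and 1 are contiguous,
# stage 2 overlaps stage 1, and stage 3 runs alone after a gap.
stages = [(0, 100), (100, 250), (180, 300), (400, 500)]
islands = group_into_islands(stages)
print(islands)
```

With these sample values the first three stages collapse into a single island and stage 3 forms the second one.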
    

The figure below depicts times and intervals in the life cycle of a query job. Note that terms used in this figure are described below.

    -- job creationTime ----------------------------------
    |                                                   |
    |  PENDING                                          |  in-flight
    |                                                   |
    -- job startTime -------------------------          |
    |                                       |           |
    |  unstaged                             |  RUNNING  |
    |                                       |           |
    -- stages_start_time ----------         |           |
    |                            |          |           |
    |  staged island (active)    |  staged  |           |
    |                            |          |           |
    -- staged island end time    |          |           |
    |                            |          |           |
    |  staged gap (inactive)     |          |           |
    |                            |          |           |
    -- staged island start time  |          |           |
    |                            |          |           |
    |  staged island (active)    |          |           |
    |                            |          |           |
    -- stages_end_time ------------         |           |
    |                                       |           |
    |  ending                               |           |
    |                                       |           |
    -- job endTime ---------------------------------------
    |
    |  DONE
    |
    |
    | time
    V
    

The terms used in the figures above are described below. Some of the terms are defined in Google documentation, and some are defined only in this guide. Google terms which are field names tend to be written in lower camel case in the REST API (e.g., field creationTime) and in snake case in INFORMATION_SCHEMA tables (e.g., field creation_time). BigQuery tends to natively record times as integer milliseconds since the UNIX epoch, but these values can be converted to timestamps.
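
As an illustration of the conversion mentioned above, the following sketch turns a count of milliseconds since the UNIX epoch into a timezone-aware Python datetime. The helper name ms_to_timestamp is hypothetical. Accepting a string as well as an integer is an assumption, made because JSON APIs commonly serialize 64-bit integers as strings.

```python
from datetime import datetime, timedelta, timezone
from typing import Union

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def ms_to_timestamp(epoch_ms: Union[int, str]) -> datetime:
    """Convert integer milliseconds since the UNIX epoch to a UTC datetime."""
    # timedelta arithmetic is exact, so no floating-point rounding is involved.
    return EPOCH + timedelta(milliseconds=int(epoch_ms))


print(ms_to_timestamp(1500))    # 1.5 seconds after the epoch
print(ms_to_timestamp("1500"))  # the same instant, from a string value
```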

creationTime

Or creation_time. Google field that indicates when BigQuery first registers a job due to a job insert request. Note that a job insert request that is rejected because it would violate the project's job concurrency limit is not recorded as part of the standard jobs information.

in-flight

Interval between creationTime and job endTime. Google field state has value PENDING or RUNNING during this interval, not DONE.

Note that some query job information which should logically be available while a query job is in-flight may not actually be available until the query job is DONE.

Below are some job stage attributes and metrics that may be misleading while a query job is in-flight.

  • name

    When a stage status is COMPLETE, name will indicate something about the operations performed in the stage (e.g., S00: Input for a first stage, which reads from a table). However, while a stage status is RUNNING, name will temporarily indicate an Output operation (e.g., S00: Output).

  • startMs

    This field (and any field derived from it) is null while the query job is in-flight.

  • endMs

    This field (and any field derived from it) is null while the query job is in-flight.

  • computeMsAvg, computeMsMax

    These fields may be overstated while the query job is in-flight.

  • waitMsAvg, waitMsMax

    These fields will be 0 while the query job is in-flight, but set to their true value when the query job is DONE.
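
One way to act on the caveats listed above is to hide the affected stage fields until the job state is DONE. The sketch below is only an illustration: the function name trustworthy_stage_fields and the dictionary representation of a stage are assumptions, and the set of fields is copied from the list above. Fields outside that list are passed through unchanged, which is not a claim that they are reliable while the job is in-flight.

```python
from typing import Any, Dict

# Stage fields this guide describes as misleading while a job is in-flight.
MISLEADING_WHILE_IN_FLIGHT = {
    "name",
    "startMs",
    "endMs",
    "computeMsAvg",
    "computeMsMax",
    "waitMsAvg",
    "waitMsMax",
}


def trustworthy_stage_fields(stage: Dict[str, Any], job_state: str) -> Dict[str, Any]:
    """Return a copy of the stage without fields that may mislead in-flight."""
    if job_state == "DONE":
        return dict(stage)
    return {k: v for k, v in stage.items() if k not in MISLEADING_WHILE_IN_FLIGHT}


# Made-up stage record, shaped like a stage of a RUNNING job.
stage = {
    "id": 0,
    "name": "S00: Output",
    "startMs": None,
    "endMs": None,
    "computeMsAvg": 12,
    "computeMsMax": 30,
    "waitMsAvg": 0,
    "waitMsMax": 0,
}
print(trustworthy_stage_fields(stage, "RUNNING"))
```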

PENDING

Google term for interval between creationTime and job startTime. Google field state has value PENDING during this interval.

job startTime

Or start_time. Google field that indicates when BigQuery first attempts to start work to generate query results.

RUNNING

Google term for interval between startTime and job endTime. Google field state has value RUNNING during this interval.

unstaged

Interval between startTime and stages_start_time. No slots are assigned to do work for the query job during this interval.

stages_start_time

Computed field that indicates the earliest start time of any stage. E.g., min(start_ms) from unnest(job_stages).

staged island (active)

One or more intervals during which one or more contiguous or overlapping stages executed. It is only during these intervals that slots are assigned to do work.

The start of an interval is one of:

  • stages_start_time

  • stages gap end time

The end of an interval is one of:

  • stages gap start time

  • stages_end_time

stages gap start time

Zero, one, or more computed times indicating the start of a stages gap, or equivalently, the end of a first or intermediate stages island. Computed as the maximum stage endMs of a stages island, which is composed of one or more contiguous or overlapping stages.

staged gap (inactive)

Zero, one, or more intervals during which there is a gap between stage islands. No stages execute during a gap, but at least one stage has executed before it and more stages must execute after it to complete the job.

The start of an interval is a staged island end time (that is, a stages gap start time).

The end of an interval is a staged island start time (that is, a stages gap end time).

stages gap end time

Zero, one, or more computed times indicating the end of a stages gap, or equivalently, the start of an intermediate or final stages island. Computed as the minimum stage startMs of a stages island, which is composed of one or more contiguous or overlapping stages.
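
The gap definitions above can be sketched as follows, assuming the stage islands have already been computed as (start_ms, end_ms) pairs. The function name gaps_between_islands and the sample values are hypothetical. Each gap starts at the end of one island (the maximum stage endMs in that island) and ends at the start of the next island (the minimum stage startMs in that island).

```python
from typing import List, Tuple

Interval = Tuple[int, int]  # (start_ms, end_ms), hypothetical representation


def gaps_between_islands(islands: List[Interval]) -> List[Interval]:
    """Return (gap start time, gap end time) for each pair of adjacent islands."""
    ordered = sorted(islands)
    return [
        (prev_end, next_start)
        for (_, prev_end), (next_start, _) in zip(ordered, ordered[1:])
    ]


# Two made-up islands produce one gap; a single island produces none.
print(gaps_between_islands([(0, 300), (400, 500)]))
print(gaps_between_islands([(0, 300)]))
```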

stages_end_time

Computed field that indicates the last end time of any stage: max(end_ms) from unnest(job_stages).
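
The two computed fields stages_start_time and stages_end_time can be sketched in Python as a minimum and a maximum over the stage records. The function name stage_bounds and the list-of-dictionaries input are assumptions for illustration. Because stage start and end times are null while a query job is in-flight, this sketch returns None rather than a partial answer when any value is missing.

```python
from typing import Dict, List, Optional, Tuple


def stage_bounds(
    job_stages: List[Dict[str, Optional[int]]]
) -> Optional[Tuple[int, int]]:
    """Return (stages_start_time, stages_end_time) in ms, or None if unknown."""
    starts = [stage.get("start_ms") for stage in job_stages]
    ends = [stage.get("end_ms") for stage in job_stages]
    if not job_stages or any(value is None for value in starts + ends):
        # No stages, or the job is still in-flight and the times are null.
        return None
    return min(starts), max(ends)


# Made-up stage records for a job that is DONE.
done_stages = [
    {"start_ms": 1300, "end_ms": 1500},
    {"start_ms": 1600, "end_ms": 1900},
]
print(stage_bounds(done_stages))
```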

ending

Interval between stages_end_time and job endTime.

job endTime

Or end_time. Google field that indicates the end of the job. At this point, the destination table, if any, should be populated and its data available for a table list operation.

DONE

Google term for the final state of a query job, which it enters at endTime once it has completed. Google field state has value DONE from that point on.
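
The intervals defined in this section can be summarized as differences between the five times in the life cycle figure. The following sketch assumes all five times are known, so the job is DONE, and that they are expressed in milliseconds. The function name lifecycle_durations_ms, the result keys, and the sample values are hypothetical.

```python
from typing import Dict


def lifecycle_durations_ms(
    creation_ms: int,
    start_ms: int,
    stages_start_ms: int,
    stages_end_ms: int,
    end_ms: int,
) -> Dict[str, int]:
    """Compute the duration in ms of each life cycle interval of a DONE job."""
    return {
        "in_flight": end_ms - creation_ms,        # creationTime to endTime
        "pending": start_ms - creation_ms,        # creationTime to startTime
        "running": end_ms - start_ms,             # startTime to endTime
        "unstaged": stages_start_ms - start_ms,   # startTime to stages_start_time
        "staged": stages_end_ms - stages_start_ms,
        "ending": end_ms - stages_end_ms,         # stages_end_time to endTime
    }


# Made-up times, in milliseconds.
durations = lifecycle_durations_ms(
    creation_ms=1000,
    start_ms=1200,
    stages_start_ms=1300,
    stages_end_ms=1900,
    end_ms=2000,
)
print(durations)
```

Note that the staged duration spans both islands and gaps. Subtracting the total gap time from it gives the time during which slots were actually assigned.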