4.2.2. Query Job Information Via Command Line

Using the command line to gather information about a query job can be more cumbersome than using SQL. However, by using the command line, you are not limited to the fields that Google has chosen to make available through the INFORMATION_SCHEMA jobs views. Moreover, you are not dependent on BigQuery slots to get information. So, using the command line could be particularly valuable for getting timely information about in-flight queries.

Instructions for getting information about a query using the command line:

  1. Start a shell with access to required commands.

    If you are expert, then you can attempt to do this on a local machine. Alternatively, to use Google Cloud Shell, see https://cloud.google.com/shell/docs/launching-cloud-shell and following sections of the Google Cloud Shell documentation. Cloud Shell is available at https://console.cloud.google.com/home/dashboard?cloudshell=true.

  2. Set project ID.

    Execute the command below, replacing PROJECT_ID with the billing project ID of the query job.

    gcloud \
        config \
        set \
        project PROJECT_ID
  3. Find query project ID and job ID.

    For example, execute the command below, modifying conditions as needed. Option --all will include jobs for all users in the project, assuming you have appropriate permissions. Option --max_rows_per_request 1000 will limit the number of rows returned by the bq command to 1000. Options --min_creation_time "$(date -d 'now - 180 minutes' -u '+%s%3N')" and --max_creation_time "$(date -d 'now' -u '+%s%3N')" will limit the jobs to just those created between 180 minutes ago and now. Code (.user_email == "user@example.com") will only include jobs created by that particular user. Code (.status.state != "DONE") will exclude DONE jobs. If filtering on user_email, specify in all lowercase.

    bq \
        --format prettyjson \
        ls \
        --jobs \
        --all \
        --max_rows_per_request 1000 \
        --min_creation_time "$(date -d 'now - 180 minutes' -u '+%s%3N')" \
        --max_creation_time "$(date -d 'now' -u '+%s%3N')" |
    jq \
        --compact-output \
        '
        .[] |
        select(
            (.user_email == "user@example.com")
            and (.status.state != "DONE")
        )
        | {
            "jobId": .jobReference.jobId,
            "state": .status.state
        }
        '
  4. Capture job information.

    Execute the command below, replacing JOB_ID as needed.

    bq \
        --format prettyjson \
        show -j \
        JOB_ID \
        >job.json
  5. Create job extraction jq program.

    If it does not already exist, create file $HOME/job.jq. Execute the command below.

    cat <<EOF >"$HOME/job.jq"
    def nullable(f):
        if . == null
        then
            null
        else
            . | f
        end;
    
    def pair_nullable(f):
        if (.[0] == null) or (.[1] == null)
        then
            null
        else
            . | f
        end;
    
    [
        {
            creation_date:
                .statistics.creationTime |
                nullable(
                    tonumber /
                    1000 |
                    strftime("%Y-%m-%d")
                ),
            project_id: .jobReference.projectId,
            job_id: .jobReference.jobId,
            reservation_id: .statistics.reservationId,
            job_type: .jobType,
            statement_type: .statistics.query.statementType,
            cache_hit: .statistics.query.cacheHit,
            priority: .configuration.query.priority,
            state: .status.state,
            parent_job_id: .statistics.parentJobId,
            num_child_jobs: .statistics.numChildJobs,
            error_reason: .status.errorResult.reason,
            error_message: .status.errorResult.message,
            user_email: .user_email,
            stages_running_count:
                .statistics.query.queryPlan |
                nullable(
                    map(
                        select(.status == "RUNNING")
                    ) |
                    length
                ),
            stages_complete_count:
                .statistics.query.queryPlan |
                nullable(
                    map(
                        select(.status == "COMPLETE")
                    ) |
                    length
                ),
            stages_count:
                .statistics.query.queryPlan | length,
            max_stages_elapsed_ms:
                .statistics.query.queryPlan |
                nullable(
                    map(
                        [
                            (.endMs | nullable(tonumber)),
                            (.startMs | nullable(tonumber))
                        ] |
                        pair_nullable(.[0] - .[1])
                    ) |
                    max
                ),
            max_wait_ms_max:
                .statistics.query.queryPlan |
                nullable(
                    map(
                        .waitMsMax |
                        nullable(tonumber)
                    ) |
                    max
                ),
            timeline_pending_units:
                .statistics.query.timeline[-1].pendingUnits,
            timeline_active_units:
                .statistics.query.timeline[-1].activeUnits,
            timeline_completed_units:
                .statistics.query.timeline[-1].completedUnits,
            timeline_total_slot_ms:
                .statistics.query.timeline[-1].totalSlotMs,
            total_slot_ms: .statistics.totalSlotMs,
            total_bytes_processed: .statistics.totalBytesProcessed,
            creation_time:
                .statistics.creationTime |
                nullable(
                    tonumber /
                    1000 |
                    strftime("%Y-%m-%d %H:%M:%S")
                ),
            start_time:
                .statistics.startTime |
                nullable(
                    tonumber /
                    1000 |
                    strftime("%Y-%m-%d %H:%M:%S")
                ),
            stages_start_time:
                .statistics.query.queryPlan |
                nullable(
                    map(
                        .startMs |
                        nullable(tonumber) |
                        select(. != null)
                    ) |
                    min |
                    nullable(
                        . /
                        1000 |
                        strftime("%Y-%m-%d %H:%M:%S")
                    )
                ),
            stages_end_time:
                .statistics.query.queryPlan |
                nullable(
                    map(
                        .startMs |
                        nullable(tonumber)
                    ) |
                    max |
                    nullable(
                        . /
                        1000 |
                        strftime("%Y-%m-%d %H:%M:%S")
                    )
                ),
            timeline_end_time:
                [
                    (
                        .statistics.startTime |
                        nullable(tonumber)
                    ),
                    (
                        .statistics.query.timeline |
                        nullable(
                            map(
                                .elapsedMs |
                                nullable(tonumber)
                            ) |
                            max
                        )
                    )
                ] |
                pair_nullable(
                    .[0] + .[1]
                ) |
                nullable(
                    . /
                    1000 |
                    strftime("%Y-%m-%d %H:%M:%S")
                ),
            end_time:
                .statistics.endTime |
                nullable(
                    tonumber /
                    1000 |
                    strftime("%Y-%m-%d %H:%M:%S")
                ),
            creation_to_end_ms:
                [
                    (
                        .statistics.endTime |
                        nullable(tonumber)
                    ),
                    (
                        .statistics.creationTime |
                        nullable(tonumber)
                    )
                ] |
                pair_nullable(
                    .[0] - .[1]
                ),
            start_to_end_ms:
                [
                    (
                        .statistics.endTime |
                        nullable(tonumber)
                    ),
                    (
                        .statistics.startTime |
                        nullable(tonumber)
                    )
                ] |
                pair_nullable(
                    .[0] - .[1]
                ),
            pending_ms:
                [
                    (
                        .statistics.startTime |
                        nullable(tonumber)
                    ),
                    (
                        .statistics.creationTime |
                        nullable(tonumber)
                    )
                ] |
                pair_nullable(
                    .[0] - .[1]
                ),
            unstaged_ms:
                [
                    (
                        .statistics.query.queryPlan |
                        nullable(
                            map(
                                .startMs |
                                nullable(tonumber) |
                                select(. != null)
                            ) |
                            min
                        )
                    ),
                    (
                        .statistics.startTime |
                        nullable(tonumber)
                    )
                ] |
                pair_nullable(
                    .[0] - .[1]
                ),
            staged_ms:
                [
                    (
                        .statistics.query.queryPlan |
                        nullable(
                            map(
                                .startMs |
                                nullable(tonumber)
                            ) |
                            max
                        )
                    ),
                    (
                        .statistics.query.queryPlan |
                        nullable(
                            map(
                                .startMs |
                                nullable(tonumber) |
                                select(. != null)
                            ) |
                            min
                        )
                    )
                ] |
                pair_nullable(
                    .[0] - .[1]
                ),
            ending_ms:
                [
                    (
                        .statistics.endTime |
                        nullable(tonumber)
                    ),
                    (
                        .statistics.query.queryPlan |
                        nullable(
                            map(
                                .startMs |
                                nullable(tonumber)
                            ) |
                            max
                        )
                    )
                ] |
                pair_nullable(
                    .[0] - .[1]
                )
        }
    ] |
    (.[0] | keys_unsorted) as \$keys |
    \$keys, map([.[ \$keys[] ]])[] |
    @csv
    EOF
  6. Create job stages extraction jq program.

    If it does not already exist, create file $HOME/job_stages.jq. Execute the command below.

    cat <<EOF >"$HOME/job_stages.jq"
    def nullable(f):
        if . == null
        then
            null
        else
            . | f
        end;
    def pair_nullable(f):
        if (.[0] == null) or (.[1] == null)
        then
            null
        else
            . | f
        end;
    [
        {
            creation_date:
                .statistics.creationTime |
                nullable(
                    tonumber /
                    1000 |
                    strftime("%Y-%m-%d")
                ),
            project_id: .jobReference.projectId,
            job_id: .jobReference.jobId,
        } +
        (
            .statistics.query.queryPlan |
            nullable(
                map(
                    {
                        stage_id: .id,
                        name: .name,
                        status: .status,
                        start_time:
                            .startMs |
                            nullable(
                                tonumber /
                                1000 |
                                strftime("%Y-%m-%d %H:%M:%S")
                            ),
                        end_time:
                            .endMs |
                            nullable(
                                tonumber /
                                1000 |
                                strftime("%Y-%m-%d %H:%M:%S")
                            ),
                        elapsed_ms:
                            [
                                (
                                    .endMs | nullable(tonumber)
                                ),
                                (
                                    .startMs | nullable(tonumber)
                                )
                            ] |
                            pair_nullable(.[0] - .[1]),
                        slot_ms: .slotMs,
                        completed_parallel_inputs: .completedParallelInputs,
                        parallel_inputs: .parallelInputs,
                        completed_parallel_inputs_percent:
                            (
                                if (.parallelInputs | nullable(tonumber)) == 0
                                then
                                    null
                                else
                                    [
                                        (
                                            .completedParallelInputs |
                                            nullable(tonumber)
                                        ),
                                        (
                                            .parallelInputs |
                                            nullable(tonumber)
                                        )
                                    ] |
                                    pair_nullable(.[0] / .[1])
                                end
                            ),
                        wait_ms_avg: .waitMsAvg,
                        wait_ms_max: .waitMsMax,
                        read_ms_avg: .readMsAvg,
                        read_ms_max: .readMsMax,
                        compute_ms_avg: .computeMsAvg,
                        compute_ms_max: .computeMsMax,
                        compute_skew:
                            (
                                if (.computeMsMax | nullable(tonumber)) == 0
                                then
                                    null
                                else
                                    [
                                        (
                                            .computeMsAvg |
                                            nullable(tonumber)
                                        ),
                                        (
                                            .computeMsMax |
                                            nullable(tonumber)
                                        )
                                    ] |
                                    pair_nullable(1 - .[0] / .[1])
                                end
                            ),
                        write_ms_avg: .writeMsAvg,
                        write_ms_max: .writeMsMax,
                        shuffle_output_bytes: .shuffleOutputBytes,
                        shuffle_output_bytes_spilled: .shuffleOutputBytesSpilled,
                        records_read: .recordsRead,
                        records_written: .recordsWritten
                    }
                )
            ) |
            nullable(
                .[]
            )
        )
    ] |
    (.[0] | keys_unsorted) as \$keys |
    \$keys, map([.[ \$keys[] ]])[] |
    @csv
    EOF
  7. Extract the job-level information.

    Execute the command below.

    jq \
        --raw-output \
        --from-file "$HOME/job.jq" \
        job.json \
        >job.csv
  8. Extract the job stage-level information.

    Execute the command below.

    jq \
        --raw-output \
        --from-file "$HOME/job_stages.jq" \
        job.json \
        >job_stages.csv
  9. If needed, copy jobs files to local machine.

    If the files reside on your Google Cloud Shell instance, Google provides instructions for downloading files to your local machine at https://cloud.google.com/shell/docs/uploading-and-downloading-files.