Using the command line to gather information about a query job can be more cumbersome than using SQL. However, the command line is not limited to the fields that Google has chosen to make available through the INFORMATION_SCHEMA jobs views, and it does not depend on BigQuery slots to return results, so it can be particularly valuable for getting timely information about in-flight queries.
Instructions for getting information about a query using the command line:
Start a shell with access to required commands.
The steps below rely on the gcloud, bq, jq, and GNU date commands. If you are comfortable installing these yourself, you can run everything on a local machine. Otherwise, use Google Cloud Shell, which provides them preinstalled; see https://cloud.google.com/shell/docs/launching-cloud-shell and the following sections of the Google Cloud Shell documentation. Cloud Shell is available at https://console.cloud.google.com/home/dashboard?cloudshell=true.
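Before continuing, it can help to confirm that those tools are actually on your PATH; the quick check below is just a convenience and assumes a Bash-compatible shell.

for cmd in gcloud bq jq date; do
  command -v "$cmd" >/dev/null || echo "missing: $cmd"
done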
Set project ID.
Execute the command below, replacing PROJECT_ID with the billing project ID of the query job.
gcloud \
  config \
  set \
  project PROJECT_ID
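To confirm the change, you can print the active project with the standard gcloud command below; it should echo PROJECT_ID.

gcloud config get-value project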
Find query project ID and job ID.
For example, execute the command below, modifying the conditions as needed. The --all option includes jobs for all users in the project, assuming you have appropriate permissions. The --max_rows_per_request 1000 option limits the number of rows returned by the bq command to 1000. The --min_creation_time "$(date -d 'now - 180 minutes' -u '+%s%3N')" and --max_creation_time "$(date -d 'now' -u '+%s%3N')" options limit the jobs to those created between 180 minutes ago and now. The jq expression (.user_email == "user@example.com") includes only jobs created by that particular user, and (.status.state != "DONE") excludes DONE jobs. If filtering on user_email, specify the address in all lowercase.
bq \
  --format prettyjson \
  ls \
  --jobs \
  --all \
  --max_rows_per_request 1000 \
  --min_creation_time "$(date -d 'now - 180 minutes' -u '+%s%3N')" \
  --max_creation_time "$(date -d 'now' -u '+%s%3N')" |
  jq \
    --compact-output \
    '
      .[]
      | select(
          (.user_email == "user@example.com")
          and (.status.state != "DONE")
        )
      | {
          "jobId": .jobReference.jobId,
          "state": .status.state
        }
    '
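If you expect to iterate on the filters, one option is to capture the listing once and filter the saved copy, so you do not re-run bq for every variation. The jobs.json file name and the RUNNING-only filter below are just examples.

bq \
  --format prettyjson \
  ls \
  --jobs \
  --all \
  --max_rows_per_request 1000 \
  --min_creation_time "$(date -d 'now - 180 minutes' -u '+%s%3N')" \
  --max_creation_time "$(date -d 'now' -u '+%s%3N')" \
  >jobs.json

# Re-filter the saved listing without calling bq again, for example keeping only RUNNING jobs.
jq \
  --compact-output \
  '
    .[]
    | select(.status.state == "RUNNING")
    | {
        "jobId": .jobReference.jobId,
        "user": .user_email,
        "state": .status.state
      }
  ' \
  jobs.json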
Capture job information.
Execute the command below, replacing JOB_ID as needed.
bq \
  --format prettyjson \
  show -j \
  JOB_ID \
  >job.json
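Before running the extraction programs, a quick spot check of job.json can confirm the capture worked; the fields below are the same ones the jq programs in the next steps read.

jq \
  '{
     state: .status.state,
     statement_type: .statistics.query.statementType,
     stage_count: (.statistics.query.queryPlan | length)
   }' \
  job.json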
Create job extraction jq program.
If it does not already exist, create the file $HOME/job.jq by executing the command below.
cat <<EOF >"$HOME/job.jq"
# Summarize one BigQuery job (job.json) as a CSV header row plus one data row.
def nullable(f): if . == null then null else . | f end;
def pair_nullable(f): if (.[0] == null) or (.[1] == null) then null else . | f end;
[
  {
    creation_date: .statistics.creationTime
      | nullable(tonumber / 1000 | strftime("%Y-%m-%d")),
    project_id: .jobReference.projectId,
    job_id: .jobReference.jobId,
    reservation_id: .statistics.reservationId,
    job_type: .jobType,
    statement_type: .statistics.query.statementType,
    cache_hit: .statistics.query.cacheHit,
    priority: .configuration.query.priority,
    state: .status.state,
    parent_job_id: .statistics.parentJobId,
    num_child_jobs: .statistics.numChildJobs,
    error_reason: .status.errorResult.reason,
    error_message: .status.errorResult.message,
    user_email: .user_email,
    stages_running_count: .statistics.query.queryPlan
      | nullable(map(select(.status == "RUNNING")) | length),
    stages_complete_count: .statistics.query.queryPlan
      | nullable(map(select(.status == "COMPLETE")) | length),
    stages_count: .statistics.query.queryPlan | length,
    max_stages_elapsed_ms: .statistics.query.queryPlan
      | nullable(
          map(
            [(.endMs | nullable(tonumber)), (.startMs | nullable(tonumber))]
            | pair_nullable(.[0] - .[1])
          )
          | max
        ),
    max_wait_ms_max: .statistics.query.queryPlan
      | nullable(map(.waitMsMax | nullable(tonumber)) | max),
    timeline_pending_units: .statistics.query.timeline[-1].pendingUnits,
    timeline_active_units: .statistics.query.timeline[-1].activeUnits,
    timeline_completed_units: .statistics.query.timeline[-1].completedUnits,
    timeline_total_slot_ms: .statistics.query.timeline[-1].totalSlotMs,
    total_slot_ms: .statistics.totalSlotMs,
    total_bytes_processed: .statistics.totalBytesProcessed,
    creation_time: .statistics.creationTime
      | nullable(tonumber / 1000 | strftime("%Y-%m-%d %H:%M:%S")),
    start_time: .statistics.startTime
      | nullable(tonumber / 1000 | strftime("%Y-%m-%d %H:%M:%S")),
    stages_start_time: .statistics.query.queryPlan
      | nullable(
          map(.startMs | nullable(tonumber) | select(. != null))
          | min
          | nullable(. / 1000 | strftime("%Y-%m-%d %H:%M:%S"))
        ),
    stages_end_time: .statistics.query.queryPlan
      | nullable(
          map(.startMs | nullable(tonumber))
          | max
          | nullable(. / 1000 | strftime("%Y-%m-%d %H:%M:%S"))
        ),
    timeline_end_time: [
        (.statistics.startTime | nullable(tonumber)),
        (.statistics.query.timeline
          | nullable(map(.elapsedMs | nullable(tonumber)) | max))
      ]
      | pair_nullable(.[0] + .[1])
      | nullable(. / 1000 | strftime("%Y-%m-%d %H:%M:%S")),
    end_time: .statistics.endTime
      | nullable(tonumber / 1000 | strftime("%Y-%m-%d %H:%M:%S")),
    creation_to_end_ms: [
        (.statistics.endTime | nullable(tonumber)),
        (.statistics.creationTime | nullable(tonumber))
      ]
      | pair_nullable(.[0] - .[1]),
    start_to_end_ms: [
        (.statistics.endTime | nullable(tonumber)),
        (.statistics.startTime | nullable(tonumber))
      ]
      | pair_nullable(.[0] - .[1]),
    pending_ms: [
        (.statistics.startTime | nullable(tonumber)),
        (.statistics.creationTime | nullable(tonumber))
      ]
      | pair_nullable(.[0] - .[1]),
    unstaged_ms: [
        (.statistics.query.queryPlan
          | nullable(map(.startMs | nullable(tonumber) | select(. != null)) | min)),
        (.statistics.startTime | nullable(tonumber))
      ]
      | pair_nullable(.[0] - .[1]),
    staged_ms: [
        (.statistics.query.queryPlan
          | nullable(map(.startMs | nullable(tonumber)) | max)),
        (.statistics.query.queryPlan
          | nullable(map(.startMs | nullable(tonumber) | select(. != null)) | min))
      ]
      | pair_nullable(.[0] - .[1]),
    ending_ms: [
        (.statistics.endTime | nullable(tonumber)),
        (.statistics.query.queryPlan
          | nullable(map(.startMs | nullable(tonumber)) | max))
      ]
      | pair_nullable(.[0] - .[1])
  }
]
| (.[0] | keys_unsorted) as \$keys
| \$keys, map([.[ \$keys[] ]])[]
| @csv
EOF
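Because the heredoc delimiter EOF above is unquoted, the shell expands the body, which is why the jq variable is written as \$keys. The small throwaway comparison below (writing only to /tmp) illustrates the behavior; with a quoted delimiter no escaping is needed.

cat <<EOF >/tmp/unquoted_demo.txt
\$keys
EOF
cat <<'EOF' >/tmp/quoted_demo.txt
$keys
EOF
diff /tmp/unquoted_demo.txt /tmp/quoted_demo.txt && echo "Both files contain \$keys"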
Create job stages extraction jq program.
If it does not already exist, create the file $HOME/job_stages.jq by executing the command below.
cat <<EOF >"$HOME/job_stages.jq"
# Emit a CSV header row plus one data row per query stage, prefixed with job identifiers.
def nullable(f): if . == null then null else . | f end;
def pair_nullable(f): if (.[0] == null) or (.[1] == null) then null else . | f end;
[
  {
    creation_date: .statistics.creationTime
      | nullable(tonumber / 1000 | strftime("%Y-%m-%d")),
    project_id: .jobReference.projectId,
    job_id: .jobReference.jobId
  }
  + (
      .statistics.query.queryPlan
      | nullable(
          map(
            {
              stage_id: .id,
              name: .name,
              status: .status,
              start_time: .startMs
                | nullable(tonumber / 1000 | strftime("%Y-%m-%d %H:%M:%S")),
              end_time: .endMs
                | nullable(tonumber / 1000 | strftime("%Y-%m-%d %H:%M:%S")),
              elapsed_ms: [
                  (.endMs | nullable(tonumber)),
                  (.startMs | nullable(tonumber))
                ]
                | pair_nullable(.[0] - .[1]),
              slot_ms: .slotMs,
              completed_parallel_inputs: .completedParallelInputs,
              parallel_inputs: .parallelInputs,
              completed_parallel_inputs_percent: (
                if (.parallelInputs | nullable(tonumber)) == 0
                then null
                else
                  [
                    (.completedParallelInputs | nullable(tonumber)),
                    (.parallelInputs | nullable(tonumber))
                  ]
                  | pair_nullable(.[0] / .[1])
                end
              ),
              wait_ms_avg: .waitMsAvg,
              wait_ms_max: .waitMsMax,
              read_ms_avg: .readMsAvg,
              read_ms_max: .readMsMax,
              compute_ms_avg: .computeMsAvg,
              compute_ms_max: .computeMsMax,
              compute_skew: (
                if (.computeMsMax | nullable(tonumber)) == 0
                then null
                else
                  [
                    (.computeMsAvg | nullable(tonumber)),
                    (.computeMsMax | nullable(tonumber))
                  ]
                  | pair_nullable(1 - .[0] / .[1])
                end
              ),
              write_ms_avg: .writeMsAvg,
              write_ms_max: .writeMsMax,
              shuffle_output_bytes: .shuffleOutputBytes,
              shuffle_output_bytes_spilled: .shuffleOutputBytesSpilled,
              records_read: .recordsRead,
              records_written: .recordsWritten
            }
          )
        )
      | nullable(.[])
    )
]
| (.[0] | keys_unsorted) as \$keys
| \$keys, map([.[ \$keys[] ]])[]
| @csv
EOF
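As an optional sanity check that both files were written correctly, you can run each program against an empty JSON object; each should print a CSV header row and one mostly empty data row rather than a jq syntax error. The empty object is only a stand-in for a real job document.

for program in "$HOME/job.jq" "$HOME/job_stages.jq"; do
  echo "== $program =="
  echo '{}' | jq --raw-output --from-file "$program"
done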
Extract the job-level information.
Execute the command below.
jq \
  --raw-output \
  --from-file "$HOME/job.jq" \
  job.json \
  >job.csv
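The result is a small CSV with a header row and a single data row. To eyeball it with aligned columns you can use the command below; note that column splits on every comma, so quoted values that themselves contain commas will not line up.

column -s, -t <job.csv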
Extract the job stage-level information.
Execute the command below.
jq \
  --raw-output \
  --from-file "$HOME/job_stages.jq" \
  job.json \
  >job_stages.csv
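If you need stage-level data for several jobs, one simple approach (a sketch; the job IDs below are placeholders) is to loop over the IDs and keep one JSON capture and one CSV per job.

for JOB_ID in JOB_ID_1 JOB_ID_2; do
  bq --format prettyjson show -j "$JOB_ID" >"job_${JOB_ID}.json"
  jq --raw-output --from-file "$HOME/job_stages.jq" "job_${JOB_ID}.json" >"job_stages_${JOB_ID}.csv"
done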
If needed, copy the job files to your local machine.
If the files reside on your Google Cloud Shell instance, Google provides instructions for downloading files to your local machine at https://cloud.google.com/shell/docs/uploading-and-downloading-files.
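In Cloud Shell, the cloudshell helper can also trigger a browser download directly; the invocation below should work as of this writing, but check the page linked above if the command has changed.

cloudshell download job.csv
cloudshell download job_stages.csv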