Below is an example of a query that can be used to analyze a query job at the job stage grain.
Search in the query for “TODO” to find lines where common changes to the query can be made. The common table expression jobs_selected can be modified as needed. Choose an appropriate source view: JOBS_BY_USER, JOBS_BY_PROJECT, JOBS_BY_ORGANIZATION, or an archive of one of these. Modify the filters as needed.
with jobs_selected as ( select cast(j.creation_time as date) as creation_date ,j.project_id ,j.job_id ,j.start_time ,j.end_time ,j.job_stages -- TODO: comment/uncomment lines as needed to choose jobs source -- from `PROJECT_ID.region-us.INFORMATION_SCHEMA.JOBS_BY_ORGANIZATION` j -- from `PROJECT_ID.region-us.INFORMATION_SCHEMA.JOBS_BY_PROJECT` j -- from `PROJECT_ID.region-us.INFORMATION_SCHEMA.JOBS_BY_USER` j -- from `PROJECT_ID.ARCHIVE_DATASET.JOBS` j -- default to this query's billing project -- from `region-us.INFORMATION_SCHEMA.JOBS_BY_ORGANIZATION` j -- from `region-us.INFORMATION_SCHEMA.JOBS_BY_PROJECT` j from `region-us.INFORMATION_SCHEMA.JOBS_BY_USER` j where j.job_type = 'QUERY' -- LOAD, QUERY, COPY, or EXTRACT -- TODO: modify filters as needed -- JOBS_BY_ tables partitioned on creation_time by day; -- creation_time is UTC and current_date defaults to UTC -- and date(j.creation_time, 'America/New_York') = current_date('America/New_York') and date(j.creation_time) = current_date -- and j.project_id = 'PROJECT_ID' -- only needed by JOBS_BY_ORGANIZATION -- and j.job_id = 'JOB_ID' -- and j.state = 'RUNNING' -- PENDING, RUNNING, or DONE ) ,job_stages as ( select j.creation_date ,j.project_id ,j.job_id ,job_stages_offset ,coalesce( js.id ,0 -- handle bug where first element id is null ) as stage_id -- note that sometimes numbers are skipped ,js.name ,js.status ,case when ends_with(js.name, ': Repartition') then js.input_stages[ ordinal( array_length( js.input_stages ) ) ] else coalesce( js.id ,0 -- handle bug where first element id is null ) end as primary_stage_id ,case when ends_with(js.name, ': Repartition') then 1 else 0 end as repartition_flag ,timestamp_millis(js.start_ms) as start_time ,timestamp_millis(js.end_ms) as end_time ,js.parallel_inputs ,case when js.parallel_inputs is not null then coalesce( js.completed_parallel_inputs ,0 ) else js.parallel_inputs end as completed_parallel_inputs ,js.wait_ms_avg ,js.wait_ms_max ,js.read_ms_avg ,js.read_ms_max ,js.compute_ms_avg ,js.compute_ms_max ,js.write_ms_avg ,js.write_ms_max ,js.slot_ms ,js.shuffle_output_bytes ,js.shuffle_output_bytes_spilled ,js.records_read ,js.records_written ,js.input_stages ,( select string_agg( substr( s.substeps[ ordinal( case when array_length(s.substeps) = 1 then 1 else 2 end ) ] ,6 ) ,'\n' order by offset ) from unnest(js.steps) s with offset where s.kind = 'READ' ) read_from ,( select string_agg( concat( s.kind ,'\n' ,( select string_agg( substeps ,'\n' order by offset ) from unnest(s.substeps) substeps with offset ) ) ,'\n\n' order by offset ) from unnest(js.steps) s with offset ) as steps from jobs_selected j join unnest(j.job_stages) js with offset job_stages_offset ) ,primary_job_stages as ( select js.project_id ,js.job_id ,js.primary_stage_id as stage_id ,min(js.start_time) as start_time ,max(js.end_time) as end_time ,sum(js.slot_ms) as slot_ms ,sum( repartition_flag ) as repartition_count from job_stages js group by js.project_id ,js.job_id ,js.primary_stage_id ) ,all_intervals as ( select ai.project_id ,ai.job_id ,ai.start_time ,ai.end_time from ( select se.project_id ,se.job_id ,se.a_time as start_time ,lead( se.a_time ,1 ,null ) over ( partition by se.project_id ,se.job_id order by se.a_time ) as end_time from ( select js.project_id ,js.job_id ,js.start_time as a_time from job_stages js union distinct select js.project_id ,js.job_id ,js.end_time as a_time from job_stages js ) se ) ai where ai.end_time is not null ) select js.creation_date ,js.project_id ,js.job_id ,js.job_stages_offset ,js.stage_id ,js.primary_stage_id ,js.name ,pjs.stage_id is not null as is_primary_stage ,js.status -- with this format, value can be treated as datetime in Excel; -- format in Excel as yyyy-mm-dd hh:mm:ss.000 ,format_timestamp( '%Y-%m-%d %H:%M:%E*S' ,js.start_time ) as start_time ,format_timestamp( '%Y-%m-%d %H:%M:%E*S' ,js.end_time ) as end_time ,timestamp_diff( js.end_time ,js.start_time ,millisecond ) as elapsed_ms ,js.slot_ms ,pjs.repartition_count ,format_timestamp( '%Y-%m-%d %H:%M:%E*S' ,pjs.start_time ) as related_start_time ,format_timestamp( '%Y-%m-%d %H:%M:%E*S' ,pjs.end_time ) as related_end_time ,timestamp_diff( pjs.end_time ,pjs.start_time ,millisecond ) as related_elapsed_ms ,pjs.slot_ms as related_slot_ms ,js.parallel_inputs ,js.completed_parallel_inputs ,cast(js.completed_parallel_inputs as float64) / nullif(js.parallel_inputs, 0) as completed_parallel_inputs_percent ,js.wait_ms_avg ,js.wait_ms_max ,js.read_ms_avg ,js.read_ms_max ,js.compute_ms_avg ,js.compute_ms_max ,1.0 - js.compute_ms_avg / nullif(js.compute_ms_max, 0) as compute_skew ,js.write_ms_avg ,js.write_ms_max ,js.shuffle_output_bytes ,js.shuffle_output_bytes_spilled ,js.records_read ,js.records_written ,array_to_string( ( select array_agg( cast(input_stages as string) order by input_stages ) from unnest(js.input_stages) input_stages ) ,',' ) as input_stages ,js.read_from ,substr( js.steps ,1 ,32767 -- maximum number of characters an Excel cell can contain ) as steps from job_stages js left join primary_job_stages pjs on pjs.project_id = js.project_id and pjs.job_id = js.job_id and pjs.stage_id = js.stage_id order by js.creation_date ,js.project_id ,js.job_id ,js.stage_id ;
The query preserves a row for all stages as Google reports them.
However, it adds some fields pertaining to the idea of
“primary” stages and “related” stages. A
“primary” stage is a non-repartitioning stage (e.g., an
INPUT
stage, JOIN
stage, etc.).
And a single set of “related” stages are a single
“primary” stage and all of the repartitioning stages
related to it. Sometimes it is useful to see all of the stages, and
sometimes it is more useful to focus on the primary stages. The
is_primary_stage computed field can
be filtered on to focus on just the primary stages. And there are a
set of computed “related” metrics that are intended to
represent the metrics over the interval of a single set of related
stages.
Fields
Cast creation_time as date in UTC time zone.
Job project_id.
Job job_id.
Zero-based offset of job_stages array.
Based on job stage id:
coalesce(id, 0)
. Field id is always
null
for the first element.
Monotonically increasing but not necessarily contiguous.
stage_id of the primary stage for this set of related stages.
Job stage name. While a
job is running, the name may be something misleading like
S00: Output
. But when the job is done,
the name for the same stage will have changed to something
which makes sense like S00: Input
.
Whether or not this stage is a primary stage.
true
or false
.
Job stage status:
'PENDING'
Job encountered error or cancellation after this stage was added but before it started.
'RUNNING'
Interval between job stage start_ms and job stage end_ms.
'CANCELLED'
Job cancelled after this stage was added but before it could complete.
'FAILED'
Error occurred in this stage.
'COMPLETE'
After job stage end_ms; succesful.
Job stage start_ms
converted to timestamp.
null
until the job is done.
Job stage end_ms
converted to timestamp.
null
until the job is done.
end_ms - start_ms
.
Job stage slot milliseconds. This will not reflect the true value until the job has completed.
Number of repartitioning stages associated with the “primary” stage.
Earliest start_time for a set of related stages.
Last end_time for a set of related stages.
related_end_time - related_start_time
.
Sum of job stage slot_ms for a set of related stages. This will not reflect the true value until the job has completed.
Number of completed units of work.
Number of units of work.
completed_parallel_inputs / parallel_inputs
.
Google indicates that it is possible for a stage to
complete without completing all units of work.
Job stage wait_ms_avg.
Job stage wait_ms_max.
Job stage read_ms_avg.
Job stage read_ms_max.
Job stage compute_ms_avg.
Job stage compute_ms_max.
1.0 - compute_ms_avg / compute_ms_max
.
Job stage write_ms_avg.
Job stage write_ms_max.
Job stage shuffle_output_bytes.
Job stage shuffle_output_bytes_spilled.
Job stage records_read.
null
until the job is done.
Job stage records_written.
null
until the job is done.
Comma-delimited list of input stages.
Newline-delimited list of input tables and stages.
Newline-delimited list of steps and substeps.