Qube 6.3 Complete Release Notes

##############################################################################
@RELEASE: 6.3.6
##############################################################################

==== CL 10514 ====
@FIX: another patch for out-of-order issue. Fixed unexpected short-circuit evaluation that was happening in the startResources() routine

==== CL 10513 ====
@FIX: another patch for out-of-order issue. Fixed unexpected short-circuit evaluation that was happening in the startHost() routine

==== CL 10512 ====
@INTERNAL: QbJob object's _subjobswaiting data was not being initialized or copied correctly, causing some job comparisons based on subjobs waiting counts to unexpectedly fail.

==== CL 10504 ====
@INTERNAL: added more log output for debugging builds, added more comments while working on out-of-order issue.

ZD: 8198

==== CL 10477 ====
@FIX: Another out-of-order fix. Jobs at the same numerical and cluster priority should dispatch in the correct FIFO order now.

The FIFO enforcing should work most of the time, but there still will be
occasional out-of-order behavior, due to the multi-threaded nature of the
supervisor. ("qbshove"-ing the older job should correct it, when it's seen)

ZD: 8198
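
Illustration only (not the supervisor's code): the intended ordering can be pictured as a sort on priority with submission order as the tie-breaker. The Job fields, the use of the job id as a stand-in for submission order, and the convention that a lower number dispatches first are all assumptions made for this sketch; cluster priority is left out.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Job:
    job_id: int    # assumed to increase with submission order
    priority: int  # assumed: lower value dispatches first


def dispatch_order(jobs):
    """Sort by priority, breaking ties by job id so equal-priority jobs stay FIFO."""
    return sorted(jobs, key=lambda j: (j.priority, j.job_id))


jobs = [Job(103, 5), Job(101, 5), Job(102, 1)]
print([j.job_id for j in dispatch_order(jobs)])  # [102, 101, 103]
```

Jobs 101 and 103 share a priority, so the older one (101) comes first.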

==== CL 10462 ====
@FIX: yet yet another fix for out-of-order dispatch behavior-- eliminate race-condition that would allow lower priority jobs that were just preempted to get workers before higher-priority jobs.
See also CL10440 10452

ZD: 8198

==== CL 10461 ====
@CHANGE: modified/compacted the multi-line "found a duty to replace" logging to be a single line.

==== CL 10452 ====
@FIX: yet another fix for out-of-order dispatch behavior-- eliminate race-condition that would allow lower priority jobs that were just preempted to get workers before higher-priority jobs.
See also CL10440

ZD: 8198

==== CL 10441 ====
@FIX: killing an already finished (complete, failed, killed) job leaves the job in the "dying" state.

==== CL 10440 ====
@FIX: another fix for out-of-order dispatch behavior-- eliminate race-condition that would allow lower priority jobs that were just preempted to get workers before higher-priority jobs.

ZD: 8198

==== CL 10429 ====
@FIX: out-of-order job dispatching issue with jobs using the "+" sign with the "host.processors" reservations.

ZD: 8198 8261 8229 8233 8228

==== CL 10189 ====
@FIX: timing issue where some worker resources (host.xyz) would disappear after the worker received a remote config.

@FIX: issue where supervisor tries to dispatch a subjob to a worker with
insufficient resources (reduced the likelihood of that happening)

@FIX: the above 2 fixes combined should now prevent some of the
out-of-priority-order dispatch issues, especially in environments where
worker resources are deployed.

ZD: 7885

==== CL 10118 ====
@FIX: fixed issue where agenda timeouts don't work properly on the first agenda item processed by a subjob, on Unix (Linux/OSX) workers

==== CL 10117 ====
@FIX: fixed issue where agenda items that fail because of timeout don't get automatically retried via retrywork
ZD: 7763

==== CL 10022 ====
@FIX: modified the worker to only report its host status to the supe when subjobs are completely done and removed, and NOT when they are only marked/scheduled for removal.

This was causing jobs to sometimes run out-of-order, especially when there
are many subjobs to each job (such as one subjob per frame), since that
situation tends to increase the chance of the supervisor dispatching the
same subjob to the same worker. The subjob will be dispatched to the same
worker, but rejected since the worker thinks it's a duplicate assignment of
a subjob that's being removed (and consequently a lower priority job will
get the worker's slot, causing out-of-order job execution)

ZD: 7601

==== CL 9903 ====
@FIX: better message from worker when it rejects a dispatched subjob because it's a duplicate (being preempted or migrated on the same worker)

==== CL 9838 ====
@CHANGE: upped the default value for supervisor_max_threads to 100, and worker_max_threads to 32

##############################################################################
@RELEASE: 6.3.5
##############################################################################

==== CL 9785 ====
@FIX: worker issue where desktop worker would randomly crash.

ZD: 6778

==== CL 9730 ====
@TWEAK: modified so that worker name and IP print when job is accepted by worker, in assignJob()

==== CL 9729 ====
@INTERNAL: changed all calls to qbvcout to qbout in the QbDaemon, QbPreforkDaemon and QbDatabaseMysql code, so that the timestamp, hostname and pid are always printed.

==== CL 9698 ====
@FIX: fixed false-negative warning message pertaining to "select() in checkpoint()" seen in supelog.

Examples of these messages:

select() in checkpoint(): Operation timed out
select() in checkpoint(): Interrupted system call

==== CL 9694 ====
@FIX: fixed issue with the supe threads getting tied up on "subjob X seems to be already assigned" message.

On a farm with busy workers, the time between the supe dispatching a subjob
to the worker via assignJob() and the worker reporting that the "subjob
is running" can be several seconds to sometimes even several minutes, which
was causing many supe threads to attempt dispatching the same subjob over
and over. All of those threads end up hitting the "subjob X seems to be
already assigned... retrying" message, and get tied up for 3 seconds while
they retry.

BUGZID:
ZD: 6760 7125

==== CL 9689 ====
@FIX: fixed bug in clustering algorithm where it incorrectly gave more
weight to a job when the only difference was the last letter in the cluster
specification.

For example, if:
host cluster: /3D/projA
job1 cluster: /3D/projB
job2 cluster: /3D

job1 was getting more weight than job2, which is incorrect.

BUGZID: 63740
ZD: 7043
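
Illustration only: the notes do not describe the actual weighting algorithm, but the example above suggests that cluster specifications should be compared one whole path component at a time, so that "projB" is not treated as a near-match for "projA". The function name below is hypothetical.

```python
def common_components(host_cluster, job_cluster):
    """Count the leading path components shared by two cluster specifications."""
    host_parts = [p for p in host_cluster.split("/") if p]
    job_parts = [p for p in job_cluster.split("/") if p]
    count = 0
    for h, j in zip(host_parts, job_parts):
        if h != j:
            break  # stop at the first component that differs
        count += 1
    return count


host = "/3D/projA"
print(common_components(host, "/3D/projB"))  # job1
print(common_components(host, "/3D"))        # job2
```

Both jobs share only the "3D" component with the host, so under this comparison neither gets more weight than the other.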

==== CL 9686 ====
@FIX: using deprecated "waitfor" attribute with Python api causes qb.submit() to raise a KeyError
@FIX: properly convert "waitfor" value (jobid integer) to proper "dependency" string of "link-done-job-<id>"
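
Illustration only: a sketch of the conversion described above, written as standalone Python rather than as the API's internal code. The helper names and the "name" key are hypothetical; the "waitfor" and "dependency" keys and the "link-done-job-<id>" format come from the note.

```python
def waitfor_to_dependency(job_id):
    """Build the dependency string for a job that waits on job_id being done."""
    return "link-done-job-%d" % int(job_id)


def normalize_job(job):
    """Return a copy of a job dict with a deprecated 'waitfor' key converted."""
    job = dict(job)
    if "waitfor" in job:
        job["dependency"] = waitfor_to_dependency(job.pop("waitfor"))
    return job


print(normalize_job({"name": "comp", "waitfor": 1234}))
```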

==== CL 9676 ====
@FIX: update documentation and GUI help text to show correct "||" syntax for job restrictions list.

==== CL 9662 ====
@FIX: supervisor was failing postflight upgrade scripts on OSX Server; explicitly set the mysql socket to /tmp/mysql.sock in /etc/my.cnf and /etc/qb.conf to avoid conflicting with the factory-installed default of /var/lib/mysql/mysql.sock

==== CL 9615 ====

@FIX: Added code to properly log frames (to supelog and job log) when they go back to "pending" after the processing subjob/worker is found dead.

@FIX: Added code in the supervisor to retry a failed worker connection
after a random 5-10 sec sleep/delay, to alleviate network hiccups during
network commands (kill, preempt, etc. of running subjobs).

ZD: 6760
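
Illustration only: a sketch of retrying a failed connection after a random 5-10 second delay. The function name, the retry count, and the use of ConnectionError are assumptions; only the 5-10 second range comes from the note.

```python
import random
import time


def converse_with_retries(connect, max_retries=3, delay_range=(5.0, 10.0),
                          sleep=time.sleep):
    """Call connect(); on ConnectionError, sleep a random delay and try again."""
    for attempt in range(max_retries + 1):
        try:
            return connect()
        except ConnectionError:
            if attempt == max_retries:
                raise  # out of retries, let the caller see the failure
            sleep(random.uniform(*delay_range))
```

Randomizing the delay keeps several threads that failed at the same moment from all retrying at the same moment.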

==== CL 9614 ====
@INTERNAL: fixed a small cosmetic bug introduced in CL 9606

==== CL 9607 ====
@INTERNAL: added converseWorkerWithRetries() and also fixed small bug in the retry loop of converseSubSupervisorWithRetries()

==== CL 9585 ====
@FIX: issue where some jobs get stuck in the "dying" state when a kill is attempted

ZD: 6616

==== CL 9570 ====
@FIX: improvements to the handling of GET_LOCK (aka "reserveJob()") timeout situations.

ZD: 6617

==== CL 9500 ====
@FIX: Windows Vista/7/2008-R2 installer - don't error out when installing the worker or supervisor as an Admin-equivalent account during creation of scheduled tasks. Properly remove scheduled tasks during uninstall.

##############################################################################
@RELEASE: 6.3.4
##############################################################################

==== CL 9550 ====
@FIX: qbwrk.conf files that had any commented lines before the first valid template was encountered would cause an exception to be raised; QubeGUI->worker->RMB->Configure (which uses qb.updateworkerconfig()) would fail silently

==== CL 9535 ====
@NEW: add submit-agenda-timeout-job.py example python script, to demonstrate submission of a job with frame-level timeouts.

ZD: 6099

==== CL 9530 ====
@FIX: Submitting paths to shotgun no longer depends on the visibility of output paths to the supervisor.
@FIX: Shotgun submission script fails gracefully & logs a reason as to why it can't generate a thumbnail when thumbnail creation fails.

==== CL 9523 ====
@FIX: fixed issue where the supervisor fails to correctly track the host assignment for subjobs.

Symptoms included supelog messages like "statusJob(): aberrant report from worker...", followed by "subjob[xxxx] is assinged to worker[] with mac address[00:00:00:00:00:00]".

These subjobs would then be in the "running" state, but not assigned to a worker.

==== CL 9522 ====
@FIX: removed code that, for jobs with host.processors set to > 1, skipped the supe's local test for resource reservations and delegated the decision-making to the workers, resulting in more network traffic and latency.

ZD: 6141

==== CL 9507 ====
@FIX: added more robust code that talks to the SMTP server when sending out email,
to support some email servers with non-standard response behavior.
ZD: 6209

==== CL 9504 ====
@FIX: catch case where sg_path_to_frames is part of the Shotgun versionName, but the job has no outputPaths for the first frame; fall back to naming the version "job id: 123 jobName: ..."

==== CL 9496 ====
@FIX: catch case when inserting a new cluster into cluster_dim when more than 1 worker exists in the new cluster; occurs during run of regular_slotcount.sql, doesn't prevent new record from being added, just generates line noise and error emails from cron...

==== CL 9494 ====
@CHANGE: make explanation of "+ | *" in job/host restrictions less ambiguous

==== CL 9484 ====
@FIX: calculate cpu-seconds for agenda-based jobs by summing up work times, not subjobs. Better support for resetting of the start times for retried work.
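
Illustration only: summing elapsed time per agenda (work) item rather than per subjob. The "start" and "end" field names are hypothetical, and the real calculation may weight items differently.

```python
def cpu_seconds(work_items):
    """Sum elapsed seconds over agenda items that have both a start and an end."""
    total = 0
    for item in work_items:
        start, end = item.get("start"), item.get("end")
        if start is None or end is None or end < start:
            continue  # not started, still running, or inconsistent timestamps
        total += end - start
    return total


work = [
    {"start": 100, "end": 160},   # 60 seconds
    {"start": 200, "end": 230},   # 30 seconds
    {"start": 300, "end": None},  # still running, ignored
]
print(cpu_seconds(work))  # 90
```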

==== CL 9467 ====
@NEW: add a random offset to the startup so that all workers don't report at the same time if they've started up at the same time.
@CHANGE: don't retrieve job name, it's extraneous and not reported; cuts down the query count by one.
@CHANGE: set workname for subjob to job.subid, not subid; easier to detect case where an agenda-based job falsely reports not having an agenda, so subjob id won't conflict with a frame number

==== CL 9463 ====
@FIX: don't report memory usage in the case where MySQL fails to return a valid agenda name, usually caused by timeouts or maxed out connections.

==== CL 9456 ====
@FIX: moved the location of QbTableVersion29.cpp (rel-6.3) inside the upgrade_supervisor.vcproj file from the incorrect "Resource Files" folder to the proper "Source Files" folder.

It appeared as though the file was missing from the build.
(probably mostly only cosmetic, but was also confusing).

==== CL 9449 ====
@FIX: fixed issue with removal of workers using the mac address (i.e. "qbadmin -worker remove <macaddr>") not working properly.

BUGZID: 63447

==== CL 9446 ====
@FIX: added "pgrp" modifying support to the supervisor code and the qbmodify() C++ API, qb.modify() Python API, and qb::modify() Perl API routines, and added a "-mpgrp <int>" option to the qbmodify command-line tool.

BUGZID: 63680

==== CL 9442 ====
@FIX: modified to raise exception when parameter "fields" is not of type list.

BUGZID: 63627
ZD: 3998

==== CL 9440 ====
@FIX: variables such as $qb::jobid not working in callbacks on Windows

BUGZID: 63686
ZD: 5240

==== CL 9427 ====
@FIX: added code to make sure all end-of-line in email data are CRLF (not just LF) in accordance with RFC2822.

This was causing notification emails to not work with some email servers, as they would not respond, and the communicating supe thread would just stall.

ZD: 5752
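
Illustration only: one way to normalize line endings to CRLF before handing a message to an SMTP server. This is a standalone Python sketch, not the supervisor's code, and the function name is hypothetical.

```python
import re


def to_crlf(text):
    """Convert bare LF or bare CR line endings to CRLF, leaving existing CRLF alone."""
    # The alternation tries "\r\n" first, so an existing CRLF is matched as a
    # unit and is not doubled.
    return re.sub(r"\r\n|\r|\n", "\r\n", text)


body = "Subject: job done\n\nJob 1234 finished.\r\n"
print(repr(to_crlf(body)))
```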

==== CL 9411 ====
@FIX: added code to chmod and open up the file permission of .out and .err files in the job log folder.

This was causing subjobs to fail on systems with "mounted" job log path, as the supervisor will initially create these files when a subjob that previously never started is retried (the supe writes "qube! - retry/requeue on blahblah...") under the "root" user's ownership with mode 644, and the workers who get the subjobs can't write to them.

ZD: 5965

==== CL 9402 ====
@FIX: adding "qbhash" command to windows.

==== CL 9395 ====
@FIX: fixed issue causing the supervisor to crash at initialization, right after "finding other supes..." was printed in the supelog.

The fix was in one of the base communication library routines, QbConnection::receiveUdp().

Sometimes, unknown/malformed data would be received on the UDP socket, and was causing the code to attempt to access beyond the buffer array (index out-of-bounds error).

ZD: 5638
BUGZID: 63305

##############################################################################
@RELEASE: 6.3.3
##############################################################################

==== CL 9370 ====
@FIX: recreate the pfx_dw stored procedures and functions on Windows, as the MSI installer wipes them out during an upgrade.

==== CL 9342 ====
@FIX: fixed a supe thread crashing issue, when global_host or license_host resource tracking is used.

ZD: 5749

==== CL 9334 ====
@FIX: add error handler for MySQL error 1146 "Table 'x' doesn't exist" for work and cpu time calculations for job data collector script
@NEW: increment datawarehouse version to 10 to allow for installing this patch into existing databases

==== CL 9325 ====
@FIX: add qbhash program to be included in qube-core RPM package.

BUGZID: 63693
ZD: 5744

==== CL 9318 ====
@FIX: fixed crash bugs that were introduced when the "dying" state was implemented for 6.3.1.

ZD: 5794

==== CL 9311 ====
@FIX: add mail template for auto-wrangling emails to the installers

##############################################################################
@RELEASE: 6.3.2
##############################################################################

==== CL 9265 ====
@FIX: fixed job-level history not being recorded into .hst file.

(Bug was introduced in CL9145, 9146)

ZD: 5609

==== CL 9261 ====
@CHANGE: cut down on the cmdline & cmdrange jobtypes' stdout; don't print 'LOG: ...' lines, make regex summaries much clearer, change printing of regex's to stderr to make it clearer that they're not actual errors, but rather things being searched for in the stderr stream.

==== CL 9252 ====
@FIX: properly find qb.conf on Windows versions Vista and later when unable to contact the supervisor directly.

==== CL 9245 ====
@FIX: GUI changes to be able to handle when supervisor host goes down, and both supervisor and MySQL server are unavailable. Also fix jobList not refreshing on down supervisor.

==== CL 9241 ====
@FIX: fix GUI crashbug in MySQLConnect when supervisor does not answer a qb.ping

==== CL 9239 ====
@FIX: global resource tables were not getting created in new instances of the datawarehouse db, only on upgrades.

==== CL 9234 ====
@FIX: disable permission check of worker_logpath, as it was creating false alarms and putting the worker in panic mode unnecessarily.

ZD: 5445 5236
BUGZID: 63683

==== CL 9232 ====
@FIX: fixed example python code (jobSubmit06.py) to work on Windows too.

==== CL 9211 ====
@FIX: added code to prevent the QbQueue::getSubjobReadyfindReady() routine from returning the same subjob to be dispatched over and over.

This was causing the findSubjobAndReserveJob() and startJob() routines to
hit the "subjob [N] seems to be already assigned" situation, and cause
threads to enter a long, sometimes semi-infinite, sleep-and-retry loop.

Fixed by adding code in the startJob() routine to quickly update the subjob
status when the assignJob() returns QB_ASSIGN_OK (i.e., worker says it
has accepted the subjob), instead of waiting until the worker later reports
that the subjob is "running" via the STATUS_JOB message, which can take
more than several seconds on a busy farm.

Also reduced the number of maximum retries to 3 (MAX_ATTEMPTS), in the
situations where a subjob "seems to be already assigned" or when a worker
host says it's busy (QB_ASSIGN_BUSY). This prevents the threads from getting
stuck for 10 or more seconds in a sleep-retry loop, and allows them to give
up quickly and move on.

ZD: 5449
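
Illustration only: a bounded assign-and-retry loop of the kind described above, written in Python rather than the supervisor's C++. The string constants stand in for QB_ASSIGN_OK and QB_ASSIGN_BUSY, the function name and retry delay are hypothetical, and MAX_ATTEMPTS is treated here as a cap on total attempts, which is an assumption.

```python
import time

MAX_ATTEMPTS = 3        # value named in the note above

ASSIGN_OK = "ok"        # stand-in for QB_ASSIGN_OK
ASSIGN_BUSY = "busy"    # stand-in for QB_ASSIGN_BUSY


def start_subjob(assign, retry_delay=1.0, sleep=time.sleep):
    """Try to assign a subjob; give up after MAX_ATTEMPTS non-OK replies."""
    for attempt in range(1, MAX_ATTEMPTS + 1):
        if assign() == ASSIGN_OK:
            return True
        if attempt < MAX_ATTEMPTS:
            sleep(retry_delay)  # brief pause, then try again
    return False  # let the thread move on to other work
```

Returning False instead of looping indefinitely is what frees the calling thread to pick up another subjob.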

==== CL 9198 ====
@FIX: fixed issue with non-node-locked licenses ("FF:FF:...") not working (since 6.3.0)

==== CL 9173 ====
@FIX: ensure that mail sent by qbadmin --emailtest is RFC2822-compliant (no bare LF's, only CRLF)

##############################################################################
@RELEASE: 6.3.1
##############################################################################

=================

@NEW: Add CentOS/RHEL 6 x64 support

==== CL 9150 ====
@INTERNAL: QbDebug::filename(QbString) took if statement out, so resetting _filename is allowed

==== CL 9145 ====
@FIX: disabled logging to /var/spool/qube/{host,user}, as it was creating large log files and causing sluggish performance.

An option to enable these logs may be made available in the future.

==== CL 9142 ====
@FIX: fixed issue where global resources tracking drifts and more subjobs than can be accommodated by the actual global resource count are dispatched.

ZD: 5074

==== CL 9133 ====
@INTERNAL: CentOS support for "buildpyc" in rpm/quberpm.pm

==== CL 9105 ====
@NEW: A new transitional "dying" state for jobs that have been ordered to be "killed", but are still being processed by the system

==== CL 9085 ====
@INTEG: main -> rel-6.[0,1,2,3] CL 9083, 9084
-----
@CHANGE: increase MySQL wait_timeout value from default of 8 hours to 36 hours to decrease frequency of "MySQL server has gone away (2006)" error messages.
@CHANGE: increase MySQL max_allowed_packet value from default of 1MB to 64MB to decrease frequency of "MySQL server has gone away (2006)" error messages.

==== CL 9084 ====
@CHANGE: increase MySQL max_allowed_packet value from default of 1MB to 64MB to decrease frequency of "MySQL server has gone away (2006)" error messages.

==== CL 9066 ====
@FIX: fixed "cpus" (subjob) count inaccuracy when a job's "cpus" was modified down and then up.

For example, if a job with initially 10 "cpus" was reduced to 5, then
subsequently increased to 6, the system had inaccurately recomputed the
subjob count to be 10.

==== CL 9058 ====
@FIX: renaming logs during rotation would fail on Windows

==== CL 8939 ====
@FIX: fixed another small "hole" that could cause race-conditions to dispatch a single subjob more than once

ZD: 4783
BUGZID: 63657

==== CL 8937 ====
@FIX: supe issue where the same subjob can be dispatched more than once to worker(s).

ZD: 4783
BUGZID: 63657

##############################################################################
@RELEASE: 6.3.0
##############################################################################

==== CL 9013 ====
@NEW: added description of supervisor_job_flags in the qb.conf.template file

==== CL 9010 ====
@FIX: fixed memory bloat issue in supervisor threads on start up, on farms with many jobs.
In some cases, it had been reported that each supe thread was taking up 500+ MB.

==== CL 8975 ====
@NEW: add section (8.7) for "externally updatable worker resources and properties" to Administration.doc

==== CL 8957 ====
@NEW: add user name to print to supelog when a worker lock is updated

BUGZID: 63661
ZD: 4860

==== CL 8949 ====
@FIX: fix datawarehouse crontab so that 7-day tables are rebuilt twice a day

==== CL 8948 ====
@NEW: add global_resource tracking to the datawarehouse

==== CL 8935 ====
@FIX: update qb.conf templates to show the correct default value for supervisor_default_security
@INTERNAL: previous setting was a hex value, which seems to be unsupported now.

==== CL 8910 ====
@NEW: add C++ examples for using the qbupdateworkerresource(), qbupdateworkerproperties(), qbdeleteworkerresources(), and qbdeleteworkerproperties() routines

==== CL 8909 ====
@NEW: add Perl API routines for externally updated worker resources/properties

* add bindings to perl
** add qb::updateworkerresources() and updateworkerproperties() to perl api
qb::updateworkerresources("shinyambp.local", "host.ooga=2/3,host.extern=0/10")
qb::updateworkerproperties("shinyambp.local", "host.oogaprop=3,host.oogaextprop2=11")

** add deleteworkerresources() and deleteworkerproperties() to perl
qb::deleteworkerresources($host, @resources);
qb::deleteworkerresources("shinyambp.local", "host.extenres", "host.ooga");

==== CL 8901 ====
@FIX: fixed bug where subjobs will be retried indefinitely when retrysubjob is set.

BUGZID: 63517
ZD: 2950 4661

==== CL 8889 ====
@FIX: fixed issue where the supervisor kept adding duplicate auto-wrangling and mail callbacks when jobs are resubmitted

BUGZID: 63655
ZD: 4661

==== CL 8886 ====
@INTEG: rel-6.2 -> main
----
@FIX: properly remove datawarehouse scheduled tasks for round-robin tables

==== CL 8885 ====
@FIX: properly remove datawarehouse scheduled tasks for round-robin tables

==== CL 8872 ====
@FIX: issue introduced in 6.2.1 that broke callbacks (not being triggered)

==== CL 8859 ====
@FIX: add bookmarks (TOC) to Admin docs, update section for qblock to refer to "Users guide" instead of non-existent "Command Reference"

==== CL 8857 ====
@NEW: add externally-updatable worker resources and properties

BUGZID:

==== CL 8847 ====
@CHANGE: upgrade_config tool no longer comments out some of the customized paths in qb.conf

ZD: 4470

==== CL 8846 ====
@FIX: supe and worker RPMs now correctly "require" specific qube-core version (like "6.2-1")

BUGZID: 63644
ZD: 4470

==== CL 8841 ====
@FIX: issue with supervisor threads stalling, waiting for NFS I/O on the "mounted" job logs, when NFS latency is large.

==== CL 8840 ====
@UPDATE: "Use" doc with p-agenda documentation
@UPDATE: also added/updated some qbsub examples

BUGZID: 63636

==== CL 8837 ====
@NEW: add example scripts to demonstrate submission of p-agenda jobs in perl and python

BUGZID: 63636

==== CL 8836 ====
@NEW: adding docs for retryworkdelay (qbsub option)

==== CL 8811 ====
@FIX: fixed worker installer to start the worker service if and only if the system has not already turned it OFF via chkconfig.

ZD: 4286

==== CL 8798 ====
@NEW: optimization when submitting big groups of jobs via qbsubmit() loaded with callbacks and dependencies
Fixed reported issue where submission performance would degrade in linear proportion to the number of jobs in the queue.

==== CL 8795 ====
@UPDATE: added descriptions of new/missing qb.conf parameters to the qb.conf.template file, which is used to build the default qb.conf.

* added p-agenda params (supe and client)
* added auto-wrangling params (supe)
* added per-user/pgrp subjob limit params (supe)
* added mail setup params (supe)
* added database setup params (supe)

==== CL 8794 ====
@NEW: add p-agenda submission options to qbsub (p_agenda, p_priority, and p_cpus), and updated online help text.

==== CL 8790 ====
@CHANGE: Python API qb.reportjob() now takes a subjob object (dict). It can still take just the status (string).

This should enable the custom jobtype back-end programmer to pass back subjob-level "resultpackage" data to the supe, for example.

==== CL 8783 ====
@NEW: add supervisor_p_agenda_max qb.conf parameter, for the site-admin to control the maximum number of p-agenda items any job can have.

==== CL 8782 ====
@NEW: add p_agenda_cpus to enable control of the number of "cpus" used for the p-agenda jobs. Defaults to number of p-agenda items.

@CHANGE: removed code that automatically makes a job become a p-agenda job when
p_agenda_priority() is set. The "p_agenda" list or the "p_agenda" job flag must be explicitly
set for a job to be a p-agenda job.

==== CL 8781 ====
@CHANGE: if an agenda-based job specifies the p_agenda_priority, then automatically add the p_agenda flag.

@CHANGE: added code to check that the job being submitted is an agenda-based one, before doing the p-agenda magic

==== CL 8775 ====
@UPDATE: doc update w/ "qbhash" and encrypted DB password descriptions

@UPDATE: Added section for qbhash, and updated section for qblogin.
@UPDATE: section for database_password

BUGZID: 63383 63628 39741

==== CL 8769 ====
@NEW: add "qbhash" tool, used to generate/display encrypted passwords

@NEW: add "-password" option to qblogin, to specify password in a command-line option instead of on the stdin

BUGZID: 63383

==== CL 8767 ====
@FIX: install datawarehouse plists on OSX (missing from installer package)

==== CL 8764 ====
@NEW: add p-agenda (p-frames, "p" stands for Priority/Preview/Poster) support, where a select few agenda items of a job can be sent at a higher priority for quicker turnaround for previewing purposes.

To use in API: set the "p_agenda" job flag when submitting an agenda-based job.

Optionally attach a list, job['p_agenda'] in python API, to the job on submission to explicitly specify the p-agenda items. If not set explicitly, the system will automatically choose the 1st, last, and middle items to be rendered at a higher priority.

The priority of the p-agenda items may also be specified on submission, by setting the job's p_agenda_priority parameter.

p-agenda job support for the standard submission tools (GUI, qbsub) coming shortly.

@NEW: qb.conf parameters: client_p_agenda_priority, supervisor_default_p_agenda_priority (default 1)
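
Illustration only: how a default choice of "1st, last, and middle" agenda items could be computed. The function name is hypothetical, and taking index len // 2 as the middle is an assumption; the note does not say how the middle item is picked for even-length agendas.

```python
def default_p_agenda(agenda):
    """Pick the first, middle, and last agenda items, without duplicates."""
    if not agenda:
        return []
    picks = [0, len(agenda) // 2, len(agenda) - 1]
    unique = []
    for index in picks:
        if index not in unique:  # short agendas can repeat an index
            unique.append(index)
    return [agenda[i] for i in sorted(unique)]


frames = [str(n) for n in range(1, 11)]  # "1" .. "10"
print(default_p_agenda(frames))  # ['1', '6', '10']
```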

==== CL 8760 ====
@UPDATE: Administration.doc with details about the new worker_boot_diagnostics_retries and worker_boot_diagnostics_retry_interval parameters

BUGZID: 63600

==== CL 8755 ====
@FIX: Added worker_boot_diagnostics_retries and worker_boot_diagnostics_retry_interval

These new configuration parameters tell the worker to automatically retry the boot-time
diagnostic routines "worker_boot_diagnostics_retries" times, with
"worker_boot_diagnostics_retry_interval" seconds of sleep time in between the retries.
By default, they are set to 1 and 30 (seconds) respectively. These values may be
set in the local qb.conf file, or in the qbwrk.conf file.

@FIX: issue where worker will "panic" when proxy settings are set in the remote qbwrk.conf file.

BUGZID: 63600 63422 63407
ZD: 3650 1638 2035

==== CL 8743 ====
@NEW: add qb.frontend package, will serve as base class for constructing jobs for new python jobtypes

==== CL 8727 ====
@CHANGE: database_password is now expected to be encrypted.

Plain text password still works, but if a password has been set up to access the MySQL db, site administrators are
encouraged, but not required, to use "qblogin -display" to generate the encrypted password, and set
database_password in qb.conf to the encrypted string for more security.

BUGZID: 63628

==== CL 8722 ====
@NEW: add optional artificial delay before auto-retry of agenda items via "retrywork"

When a failed frame is automatically retried via "retrywork", an artificial delay may be inserted before the subjob starts processing it.
Requested by customers to work around issues with, for example, application license contentions.

Submission APIs (C++, Perl, Python) and clients (qbsub, QubeGUI) modified to allow specifying "retrywork_delay" when submitting jobs.


==== CL 8717 ====
@FIX: logs written into a "hidden" file, in "log/user/.hst", which grows very large

Actions initiated by the supe (as opposed to a particular user), such as
"starting a subjob on worker", were logged into this hidden ".hst"
file. Fixed it so the file has a special folder/name,
"__QUBE_SYSTEM__/__QUBE_SYSTEM__.hst".

Also modified code so that if the "user" flag was omitted from the
"supervisor_log_flags", then this user action logging is disabled
altogether.

BUGZID: 62030

==== CL 8713 ====
@FIX: turned off worker debug-level logging that accidentally made it into the 6.2.0 release.

==== CL 8712 ====
@FIX: issue where worker processes will stall when a config field, such as "worker_description", has quotes in it.

==== CL 8704 ====
@FIX: support bash exported function definitions, which are saved as multi-line environment variable values

BUGZID: 63624
ZD: 4100

==== CL 8702 ====
@NEW: add perl 5.12 and 5.14 support for windows x64 and 32-bit.
BUGZID: 63631

==== CL 8695 ====
@FIX: export_environment now works properly with built-in cmd* jobtypes

@FIX: cmd* jobtype backends will run jobs in a non-login shell if
export_environment flag is set on the job, to avoid overriding of
environment variables set by the job's submission environment.

@NEW: QbApi::qbsystem() now optionally takes a boolean to specify
commands to be run in a login shell.

@CHANGE: By default now, QbApi::qbsystem() will run the given command in a
non-login shell.

@NEW: added optional "shell" parameter to QbEnv::setToEnv(user, [shell])
method, so the user environment for a non-default shell can be fetched.
This new method is called from QbWorker::QbUnix.cpp now.

BUGZID: 63625
ZD: 4100
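
The login versus non-login distinction can be sketched as follows. This is an illustration of the general technique, not of QbApi::qbsystem() itself: the helper names shell_argv and run are hypothetical, bash is assumed as the shell, and the "-l" flag is bash's way of requesting login-shell behavior (other shells may differ).

```python
import subprocess


def shell_argv(command, shell="/bin/bash", login=False):
    """Build the argument list for running a command through a shell.

    With login=True the shell reads the user's login profile, which may
    override variables inherited from the submission environment.
    """
    argv = [shell]
    if login:
        argv.append("-l")  # bash: behave as a login shell
    argv += ["-c", command]
    return argv


def run(command, login=False, env=None):
    """Run the command; non-login by default, mirroring the new default."""
    return subprocess.run(
        shell_argv(command, login=login),
        env=env,
        capture_output=True,
        text=True,
    )
```

Passing the job's exported environment through env while leaving login=False keeps the profile scripts from replacing those values.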

==== CL 8693 ====
@FIX: fix "ERROR 1290 (HY000) at line 31 in file: '.\create_stored_programs.sql'" on new Windows installations
@FIX: fix Windows 5.15-beta version specific SQL syntax error (does not exhibit in later versions of MySQL)

==== CL 8676 ====
@FIX: add code to license check routine to validate hostid against all MAC addresses on the host, as opposed to just the primary one.

Note: this involves changes to the base library (utils/QbList, utils/QbServer)

BUGZID: 63621
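
A minimal sketch of the matching idea, assuming the license hostid is a MAC address that may be written with different separators or letter case. The function names and the normalization rules are assumptions made for illustration and do not describe the actual license code.

```python
import re


def normalize_mac(mac: str) -> str:
    """Reduce a MAC address to 12 lowercase hex digits, dropping separators."""
    return re.sub(r"[^0-9a-f]", "", mac.lower())


def hostid_matches(license_hostid, host_macs):
    """Return True if the license hostid equals any MAC address on the host."""
    wanted = normalize_mac(license_hostid)
    # Guard against an empty hostid matching an empty interface entry.
    return bool(wanted) and any(normalize_mac(m) == wanted for m in host_macs)
```

Checking every interface means a license keyed to a secondary network card still validates when the primary interface changes.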
--
@CHANGE: modify license verification code to only run when the license file has been changed, or a new day has arrived, or on boot.

The code still checks the modification time of the license file every time that a license access is required, but most of the logic is now short-circuited if no modification was made to the file.

It turns out to be rather tricky to, say, add a "reread" option to "qbadmin" to only read the license on demand, since all supe threads must be told to read the file (for quick access, license data is kept in memory of each thread/proc), and such a "broadcast" type of instruction to all threads is not supported at the moment.

The optimization being checked in, however, should significantly reduce the overhead in license-checking nonetheless, especially with the new code where each license key's hostid is checked against all MAC addresses for validation.

BUGZID: 63622
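
The short-circuit described above can be sketched as a small cache keyed on the file's modification time and the current date. The class name LicenseCache and the verify callback are hypothetical; this shows the general pattern under those assumptions, not the supervisor's implementation.

```python
import datetime
import os


class LicenseCache:
    """Re-run an expensive verification only when it could have changed."""

    def __init__(self, path, verify):
        self.path = path
        self.verify = verify  # expensive full verification, called with the path
        self._mtime = None    # None forces a verification on first use ("on boot")
        self._day = None
        self._result = None

    def check(self, today=None):
        today = today or datetime.date.today()
        # Cheap check performed on every license access.
        mtime = os.stat(self.path).st_mtime_ns
        if mtime != self._mtime or today != self._day:
            # File changed, a new day arrived, or this is the first call.
            self._result = self.verify(self.path)
            self._mtime, self._day = mtime, today
        return self._result
```

Each thread or process can hold its own cache instance, so no broadcast is needed: every instance notices the changed modification time on its next access.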

==== CL 8668 ====
@FIX: fix "/etc/rc.d/init.d/supervisor: line 139: [: /var/spool/qube/user/jburk/jburk.hst: binary operator expected" error message in supervisor startup

BUGZID:63618

==== CL 8662 ====
@UPDATE: update doc with mail_from parameter description.
BUGZID: 63591

==== CL 8654 ====
@FIX: Made the qube-core RPM "obsolete" the "qube" package, to
accommodate the change in RPM package name.

BUGZID: 63611
ZD: 3950

==== CL 8641 ====
@FIX: added more details to default qb.conf template's description of proxy_nice_value, and also included explanation for Windows.
Also corrected the commented-out default proxy_account to "qubeproxy" (from "proxyuser") in the same qb.conf.template.

@DOC: update proxy_nice_value doc accordingly.

==== CL 8610 ====
@FIX: issue where supe will install but not run, due to missing python25.dll file.

==== CL 8606 ====
@FIX: The "Start Time" parameter for SCHTASKS.EXE (/ST option) must be in hh:mm:ss format for earlier versions of Windows (notably winxp 32).
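
As an illustration, the sketch below formats a start time with explicit seconds and builds a one-shot SCHTASKS.EXE command line around it. The helper names, task name, and command are hypothetical; only the /ST format requirement comes from the note above.

```python
import datetime


def schtasks_start_time(when: datetime.time) -> str:
    """Format a time as hh:mm:ss, which older Windows versions require for /ST."""
    return when.strftime("%H:%M:%S")


def schtasks_create_argv(task_name, command, when):
    """Build the argument list for scheduling a one-time task."""
    return [
        "SCHTASKS.EXE", "/Create",
        "/TN", task_name,
        "/TR", command,
        "/SC", "ONCE",
        "/ST", schtasks_start_time(when),
    ]
```

Always emitting the seconds field keeps one code path for both older and newer Windows versions, assuming the newer ones also accept the longer form.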

==== CL 8598 ====
@NEW: add sample perl-based submit script that submits jobs with per-work email notification callbacks.
ZD: 3854

==== CL 8550 ====
@FIX: rolling back to linking supe against python 2.5 for its embedded interpreter instead of 2.7 to avoid runtime linkage issues with 2.7

BUGZID:

==== CL 8547 ====
@FIX: added "post" as possible supervisor_language_flags
@FIX: default for supervisor_manifest_flags should be empty

==== CL 8535 ====
@CHANGE: Enhanced shotgun integration in job submission.

==== CL 8503 ====
@CHANGE: grant access to the PFX_*QBTIME* functions to MySQL user "qube_readonly"
@CHANGE: grant the pfx_dw user all rights to the pfx_stats DB

==== CL 8483 ====
@CHANGE: add support for non-cmdrange type backends, don't require qbTokens

==== CL 8482 ====
@NEW: a framework for python-based jobtype backends, as well as a base class for jobtypes which use an application's embedded python terminal prompt
* for use by the Nuke python jobtype (dynamic allocation)
* used by the intra-frame progress in pycmdrange
* can be used for Houdini jobtype dynamic allocation (not mantra cmd-line renderer though)

==== CL 8480 ====
@FIX: fixed issue with perl API where the system won't respect the "retrywork" specified in jobs processed with a perl-based custom jobtype back-end.

@CHANGE: added some useful logging message to print to supelog when retrywork is being considered

==== CL 8472 ====
@FIX: issue where perl-based custom policy didn't work on some systems.

Embedded perl interpreter had to be initialized much earlier than it was, before the supervisor goes into multi-proc, and
before initializing customizable modules (algorithm, policy) that rely on it.

ZD: 3718
BUGZID: 63603

==== CL 8468 ====
@FIX: (Windows) modified worker memory tracking to store values in KB instead of bytes, to avoid buffer overflow.

ZD: 3308

==== CL 8466 ====
@FIX: (OSX) modified worker memory tracking to store values in KB instead of bytes, to avoid buffer overflow.

ZD: 3308

==== CL 8465 ====
@FIX: (Linux) modified worker memory tracking to store values in KB instead of bytes, to avoid buffer overflow.

BUGZID: 3308
ZD:
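
The notes do not say which field overflowed. Purely as an assumption for illustration, the sketch below supposes a 32-bit signed integer field and shows why a byte count stops fitting on hosts with a few gigabytes of memory while the same amount expressed in KB fits easily. The helper name as_int32 is hypothetical.

```python
import ctypes


def as_int32(value):
    """Return the value as it would be stored in a 32-bit signed integer."""
    # ctypes performs no overflow checking, so large values wrap around.
    return ctypes.c_int32(value).value


four_gib_bytes = 4 * 1024**3
four_gib_kb = four_gib_bytes // 1024

print(as_int32(four_gib_bytes))  # no longer equals the real byte count
print(as_int32(four_gib_kb))     # the KB figure is stored unchanged
```

Under that assumption, storing KB raises the largest representable amount by a factor of 1024, from about 2 GiB to about 2 TiB.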

==== CL 8458 ====
@NEW: add documentation for the "Get Next n Jobs" jobList pagination, document the new behavior of the User filterCtrl, since it now serves as both a display and request filter.

==== CL 8453 ====
@NEW: implement the ability to parse the logs on the fly to determine intra-chunk progress
@INTERNAL: clean up backend base class and backendUtils in preparation of more wide-spread use

==== CL 8449 ====
@NEW: a pure-python implementation of the cmdrange jobtype; it implements intra-chunk progress by parsing the output stream from the command as it's being written to disk during the course of the job, not after the job completes. Progress calculation works on both single- and multiple-item agenda jobs.
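
The idea of deriving progress from output as it arrives can be sketched with a generator that scans each line for a percentage. The regular expression, the function name iter_progress, and the sample log lines are assumptions for illustration; the real jobtype's patterns and reporting calls are not described in these notes.

```python
import io
import re

# Hypothetical pattern: a 1-3 digit number followed by a percent sign.
PROGRESS_RE = re.compile(r"(\d{1,3})\s*%")


def iter_progress(stream, pattern=PROGRESS_RE):
    """Yield a progress percentage each time the output reports a new value.

    Lines are consumed one at a time, so this works on a stream that is
    still being written, not only on a finished log file.
    """
    last = None
    for line in stream:
        match = pattern.search(line)
        if not match:
            continue
        percent = min(int(match.group(1)), 100)
        if percent != last:  # suppress repeated reports of the same value
            last = percent
            yield percent


log = io.StringIO(
    "loading scene\nrendering: 25%\nrendering: 25%\nrendering: 80%\ndone 100%\n"
)
print(list(iter_progress(log)))
```

In a backend, the stream would be the running command's output pipe, and each yielded value would be forwarded as the chunk's current progress.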

==== CL 8436 ====
@NEW: add doc for per-user/pgrp subjobs limits

 

 

------=_Part_8948_162421424.1711709023833--