The Tactical Producer's Journal (2024)

This article will focus primarily on the best methods to strategically deploy Software Test Engineers (and their subordinates) on low, mid, and large budget interactive entertainment apps & games for both console & PC platforms (including browser and non-browser based games). With the emergence of increasingly popular low budget F2P games, I thought it very timely to share some Selenium automation testing tips and primers for new DevManagers. Additionally, given the vast budgets that AAA games & new consoles demand these days (up to $250 million for a high profile game, and billions of dollars to produce a new console…), inefficiencies & inexperience in your QA Engineering department can cost you big money over time by needlessly performing manual test tasks, bungling your tools, and letting reputation-damaging bugs slip “into the wild”. So I’m going to explain “STE Engineering Best Practices” here to Producers and Development Managers- but this primer is particularly aimed at those who achieved their management positions through SDE or business management experience, but have never worked in an SDET or gametester role before.

FIRST THINGS FIRST- PLAN OUT YOUR TEST MILESTONES, AND CHOOSE A CUSTOM TESTING STRATEGY THAT BEST FITS YOUR NEEDS

Sr. SDET’s/STE-3/Sr. Tools Programmers should ideally start planning out the internal tools architecture, in addition to identifying & designing requirements for future automation needs under the direction of the Lead SDET. Once the groundwork is drafted out by the Senior STE’s, development of the more advanced custom internal tools can begin in earnest. However, priority #1 should always be ensuring that a working bug report tool (e.g. JIRA or another defect tracker) plus a fresh database is ready to go for the entire test team on day one of the test cycle- although in some cases all that may be needed is simple updates and minor customization of a pre-existing toolstack, provided it has been successful on past projects at that organization and is compatible with the new project.

Another critical strategic decision to be made early on by the Lead SDET or QA Manager is how the lower skilled Test Associates will go about their own testing duties. Generally, the question is: Should we adopt a strategy of written test cases, directed ad hoc testing, free testing, or other alternatives? Based on my experience, I find the most effective strategy to be a mix of 40% written test cases, 50% directed ad hoc testing, and 10% “free time” edge case testing (edge cases being unexpected, unusual, or malicious consumer behavior).
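
To put numbers on that mix, here is a minimal sketch of the arithmetic. The function name, the dictionary keys, and the example headcount are my own placeholders, not part of any real tool, and the 40/50/10 split is only the rule of thumb above, so adjust it to your own project.

```python
def split_test_hours(total_hours, mix=None):
    """Split a block of tester hours across the three testing styles."""
    # Default mix is the 40/50/10 rule of thumb; the key names are placeholders.
    if mix is None:
        mix = {"written_test_cases": 40, "directed_ad_hoc": 50, "free_edge_case": 10}
    if sum(mix.values()) != 100:
        raise ValueError("mix percentages must add up to 100")
    return {style: total_hours * pct / 100 for style, pct in mix.items()}


if __name__ == "__main__":
    # Hypothetical example: 10 Test Associates working 40-hour weeks.
    for style, hours in split_test_hours(10 * 40).items():
        print(f"{style}: {hours:g} hours")
```

On a squad like the hypothetical one above, the 10% of free time works out to about four hours per tester per week, which is a cheap price for catching an exploit before launch.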

Written test cases provide assurance of certification for specific critical areas. Always remember to plan out at least some written test cases that cover broad “end to end” scenarios though, not just miniature test cases that cover tiny specific actions. Generally, STE-1’s and Assistant Test Lead Associates are good candidates to write up the test cases- this is a very effective way to utilize their upper-mid level of skills and experience. However, a slightly larger dose of directed ad hoc manual testing keeps the basic testers both more alert & actively involved, whilst leaving room to shift attention towards the problem area du jour. Teams that mandate 100% written test case execution tend to turn potentially talented Test Associates into mindless drones, more prone to zoning out and checking off boxes whilst daydreaming than to catching any other random bugs they may come across during execution of the test case checklist.

Unfortunately, many organizations DO NOT allocate “free” manual testing time for creative edge case testing. However, I have found that just a little free time spent on edge case testing produces a great return on investment, and identifies & prevents potentially gamebreaking exploits that may only be discovered by pure tester creativity or even a little 15 minute brainstorm jam session at the end of the shift. It’s better to let your testers bust these unexpected exploits before release, rather than let some punk kid discover one down the road and post it on YouTube for the gamer or consumer community to learn about & copycat, thereby forcing an emergency patch post-release. This is not only bad PR, but could also cause a serious production disruption by pulling resources away from your next project- just to deal with an emergency issue from last year’s game!

GET STARTED

STE-2’s at the intermediate level are best utilized by starting work on automated test harness development early on as well, since it will take a few weeks to develop & customize the code, optimize it, and fully debug the harness setup. That way, early into the test cycle you’ve got a working partial spectrum test harness ready to go and start producing reliable automated OS (operating system) testing results. OS automated test results are among the most critical varieties of test automation output, because this data is MOST likely to detect and help debug Priority 1 crashes and integral architectural flaws. Unlike the bug reporting tools and database, which need to be fully functional on day 1 so the entire test team has at least a minimal bug reporting & tracking ability, you don’t always need broad spectrum automated test harnesses immediately- the game/app generally isn’t even going to be completely functionally testable for a few months. So make sure to expand the automated test spectrum in a prioritized manner and adjust the harness accordingly as functionality and features grow in scope over the course of the SDLC.

Jr. STE’s not assigned to writing test cases should first start whipping up elementary test scripts immediately, and then begin training the Senior Testers/AKA Tester-3s on how to deploy these minor preconfigured scripts early on. The earlier you develop consistent & habitual deployment of basic timesaving scripts like this, the quicker you will begin seeing returns on time & money saved by efficiently automating mundane and manual tests & tasks. After the senior testers get the hang of deploying the miniscripts themselves, they themselves should step up and show some leadership ability by then instructing the intermediate testers/AKA Tester-2s on how to deploy these scripts as well.

Generally your greenhorn Test Associate-1’s are going to be basically helpless new fish the first few months on the job and are better left performing the easiest, most basic manual test tasks- at least until they develop further technical proficiency by gaining a little bit more experience on the job. Whilst expecting excellence and high standards from ALL employees is a nice ideal… from my experience across multiple projects, it’s simply a wasteful process to assign technical testing to raw recruits who are more likely to flub a task they don’t fully understand. This will ultimately just end up wasting the time of the Lead Test Associate, who then has to correct their mistakes- when the TA Lead’s time is most effectively spent directing and administrating the test associate squad as a whole and ensuring that all TA’s understand the needs of the project and the SOP’s of writing good bug reports. To analogize this point, the Lead Test Associate is akin to a football team’s Assistant Defensive Coordinator- they need to get the entire defense on the same page, while backing up the Head Defensive Coordinator (The Lead SDET). The TA Lead cannot spend 100% of their time purely devoted to developing only the third string benchwarmers and practice squad- at the expense of their role as second-in-command responsible for supervising the entire defense.

This strategy is not meant to disregard the importance of developing the greenhorns, but in my experience brand spankin’ new Tester-1’s and Test Interns truly need several weeks on the job simply absorbing their new technical environment and the esoteric lingo that comes along with it, before they are prepared to take on any technical testing. Frankly, it can be a little bit overwhelming at first, especially if you’ve got no prior experience in CS or QA… Let them learn to walk, before allocating senior resources training them to run. Assistant TA Leads are particularly well suited for helping develop TA-1’s into TA-2’s.

Finally, one of the most critical lessons that PM’s and DevManagers need to learn is that software test engineers and software development engineers of ALL levels tend to share one very, very bad habit that simply needs to be eliminated from this industry as a whole. This sad creature is known as the “DevBug”. It’s basically a bug report that is only 10% complete, doesn’t even meet the standards of a bug report written by an unpaid intern playtester, and is completely incomprehensible to anybody else stuck trying to interpret this abominable thing. It generally is a bug report titled something like this: “Bug: The Thingie is not wurking.” Repro steps? #1 Start the game. #2 Find the Thingie. #3 It’s broken. Naturally, there’s generally not even a screenshot or crash call stack attached either… While the gentleman who wrote the bug may know what he is referring to in his own mind… the Producer doesn’t understand how to appropriately triage/prioritize this report, any tester stuck trying to repro finds himself stuck with an unactionable task, and it’s simply unprofessional software development. Managers- please do the industry a favor and stop allowing this kind of bad behavior. Crack down on this from day zero of the entire Software Development Life Cycle.
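
One way to crack down is to reject incomplete reports before they ever reach triage. Below is a rough sketch of such a gate. The `BugReport` fields and the `find_devbug_problems` function are my own illustrative choices and do not mirror the schema of JIRA or any other real tracker, so treat the list of required fields as an assumption to tune for your own studio.

```python
from dataclasses import dataclass, field


@dataclass
class BugReport:
    # Hypothetical fields; a real tracker will have its own schema.
    title: str
    repro_steps: list = field(default_factory=list)
    expected_result: str = ""
    actual_result: str = ""
    build_id: str = ""
    attachments: list = field(default_factory=list)


def find_devbug_problems(report):
    """Return a list of reasons the report is not actionable (empty if fine)."""
    problems = []
    if not report.repro_steps:
        problems.append("no repro steps")
    if not report.expected_result.strip():
        problems.append("missing expected result")
    if not report.actual_result.strip():
        problems.append("missing actual result")
    if not report.build_id.strip():
        problems.append("missing build ID")
    if not report.attachments:
        problems.append("no screenshot or call stack attached")
    return problems


if __name__ == "__main__":
    devbug = BugReport(
        title="Bug: The Thingie is not wurking.",
        repro_steps=["Start the game.", "Find the Thingie.", "It's broken."],
    )
    for problem in find_devbug_problems(devbug):
        print("REJECTED:", problem)
```

A check like this cannot judge whether the repro steps are any good, only whether the basics were filled in, so it supplements a manager's standards rather than replacing them.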

APPROACHING THE MIDWAY POINT OF THE TEST CYCLE

(For web browser based games): Selenium automation test tools are useful for web browser-based testing and come in two formats. Selenium IDE is easy beans Jr. STE level stuff, suited to startup level organizations working on F2P and low budget browser-based games. AAA quality browser-based game projects require STE level 2 skill to best operate the more advanced Selenium WebDriver, which supports multiple programming languages. However, from time to time, problems with the automation will pop up, and it is a lot easier to “look under the hood” and debug your WebDriver automation if you’ve chosen to script it in a more traditional scripting language such as JavaScript or Python.
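
For a taste of what a WebDriver script looks like in Python, here is a minimal smoke check sketch. It assumes the `selenium` package plus a local Chrome install, and it assumes your game renders into a `canvas` element. The function names, the URL, and the CSS selector are placeholders of mine, so swap in whatever your own game page actually uses.

```python
def title_contains(driver, expected_fragment):
    """True if the page title contains the expected text (case-insensitive)."""
    return expected_fragment.lower() in driver.title.lower()


def run_title_smoke(url, expected_fragment, canvas_css="canvas", timeout_s=10):
    """Open the page, wait for the game canvas, then check the title."""
    # Imported lazily so the helper above can be unit tested without Selenium.
    from selenium import webdriver
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support import expected_conditions as EC
    from selenium.webdriver.support.ui import WebDriverWait

    driver = webdriver.Chrome()  # assumes Chrome and a matching driver are available
    try:
        driver.get(url)
        # Raises TimeoutException if the (assumed) canvas never appears in the DOM.
        WebDriverWait(driver, timeout_s).until(
            EC.presence_of_element_located((By.CSS_SELECTOR, canvas_css))
        )
        return title_contains(driver, expected_fragment)
    finally:
        driver.quit()  # always close the browser, even when the check fails


if __name__ == "__main__":
    # Placeholder URL and title text; replace with your own game's values.
    print(run_title_smoke("https://example.com/my-game", "My Game"))
```

Keeping the pass/fail logic in a small helper like `title_contains` makes the script easier to debug when the automation breaks, since you can test that logic without launching a browser at all.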

(For console games or PC games requiring network matchmaking): Remember to wait to develop and deploy “Walkers” (environmental gameplay automators) and “Load Stress Tests” until midway through the project. These aren’t going to do much good when you are still in that super buggy Alpha stage we’re discussing right now. Otherwise your environmental gameplay automation will be constantly “falling through the world” and therefore spamming out unnecessary Walker bug reports… rather than mining beneficial analytics and actionable bug reports. Additionally, premature network log in stress testing & automated matchmaking testing will not be super helpful here either. That’s because generally at this point in the SDLC, the game and server haven’t even been developed enough to create reliable performance yet- once again, premature deployment of automated matchmaking testing is simply more likely to spam your Network Engineers with unactionable synchronicity bug reports. That’s because your Gameplay and UI engineers probably haven’t even finished developing and debugging the multiplayer portals and matchmaking tooltabs yet. You’ve got nothing organic, nor reliable to test here yet.

That’s not to say that a little premature stress testing doesn’t have its uses- if you have a talented, progressive team of server engineers then at least they can start research into “testing upstream” by identifying and predicting future trending problematic areas that they are likely to encounter on the next testing milestone- sometimes asking an SDE/Gameplay Programmer to help them create a miniature MP level on a private build can be quite helpful in testing upstream. But do keep in mind just how inorganic this may be- you will NOT be testing in a true aurora environment yet.

HALFWAY INTO THE TEST CYCLE- IT’S TIME TO BUMP IT UP A NOTCH

Once you get to Beta on large budget games, the overnight log-in stress tests and Walkers should be fully developed and debugged. They should be preconfigured, and include basic concise instructions (or even a simple GUI), so that every Test Associate from level 1 thru 3 can deploy the Walkers and log-in stress testers every evening at End of Shift via your preferred method of choice, so that these two forms of automation are farming while you sleep, and producing results to be evaluated & triaged first thing the next morning. These performance tools should be expected to be (mostly) functional by now, so assign an STE-1 to start running Tools BVT’s just for the first week- Tools BVT is not rocket science, but the first few times you smoketest these tools, it’s better to have a Jr. STE run this BVT rather than a Senior Test Associate- just for now.
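
To illustrate the shape of a preconfigured log-in stress tester, here is a toy sketch. It does not talk to a real server: `fake_login` is a made-up stand-in that fails every 10th user, and `run_login_stress` is a hypothetical name of my own. A real tool would call your game's actual login endpoint and would need far more care around timeouts and rate limits.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def fake_login(user_id):
    """Hypothetical stand-in for a real login call: every 10th user fails."""
    time.sleep(0.001)  # pretend to wait on the network
    return user_id % 10 != 0


def run_login_stress(login_fn, user_count, max_workers=8):
    """Attempt user_count logins in parallel and summarize the failures."""

    def attempt(user_id):
        # An exception counts as a failed login so one bad call cannot kill the run.
        try:
            return user_id, bool(login_fn(user_id))
        except Exception:
            return user_id, False

    started = time.perf_counter()
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        outcomes = list(pool.map(attempt, range(1, user_count + 1)))
    elapsed_s = time.perf_counter() - started

    failed = [user_id for user_id, ok in outcomes if not ok]
    return {"attempted": user_count, "failed": failed, "elapsed_s": elapsed_s}


if __name__ == "__main__":
    summary = run_login_stress(fake_login, user_count=50)
    print(f"{len(summary['failed'])} of {summary['attempted']} logins failed "
          f"in {summary['elapsed_s']:.2f}s")
```

The point of returning a plain summary is that a Test Associate can kick the run off at End of Shift and somebody else can triage the failed user list the next morning.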

The reasoning behind this is that the first couple of daily Tools BVT’s are the most likely to reveal bugs in the functional toolstack your team needs & uses. “Tools” bug reports are a bit trickier to write up than your standard video game bug or web browser spelling error report, so having an STE run the first couple of tool smokes better establishes how the Test Engineers prefer “their” bugs to be written up by any Test Associates who end up stumbling across a bug with JIRA, Team Foundation Server, the database, etc.

The logic behind waiting to publish a Tools BVT until this point in a AAA budget test cycle is thus: you don’t really need to waste payroll manhours and spam people’s inboxes by publishing a department wide report early on when all you are rocking is a bug reporting tool and a database. The tools programmers sit next to each other anyways and will already know what’s good each day via scuttlebutt. However, once you get to the point where you’re incorporating JIRA, TFS, MTM, integrally designed console-PC interface command prompts, proprietary video capture, beta-user automated crash analytics, Walkers, overnight server stress testing, and scripts of all spices & flavors… at this point EVERYBODY needs to know what is working, every single day.

The intermediate test engineering tasks (i.e. harness maintenance and Walker/log-in stress testing + troubleshooting) should be assigned to an STE-2 to work on for now, under the watchful direction of a Sr. SDET/STE-3 of course. That way a Jedi Master is keeping an eye out on the Jedi Knight to help continue the Knight’s development, at least until the STE-2 is ready to graduate into Sr. STE level stuff. Once the Tools BVT process is stable, find a strong Senior Tester or two and get them permanently assigned to handle Tools BVT’s for the duration of this portion of the SDLC. Find another strong Senior Test Associate to start Build/OS BVT’s as well.

STARTING TO WRAP UP TESTING

By now your test harness should be finely tuned. Automation Results Evaluation should initially be done by an STE-1 for the first couple of days just to make sure the procedures have been set up efficiently, but then he should pass the Automation Results BVT off to another strong Senior Test Associate to maximize efficiency and minimize waste of the more expensive engineering payroll labor hours. It doesn’t take a programming God like John McAfee or John Carmack to simply assess the automation passes & failures, triage them out appropriately, and publish the Automation BVT report, so a Senior Tester should ultimately be able to take over this role for now, freeing up his previous STE-1 trainer for higher engineering tasks and letting that STE-1 start training on STE-2 level skills.
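
To show how mechanical this evaluation can be, here is a small sketch that counts passes & failures and groups the failures by feature area for triage. The result format (dictionaries with `test`, `status`, and `area` keys) is an assumption of mine. Your harness will emit its own format, so the parsing would need to change to match.

```python
from collections import Counter


def triage_automation_results(results):
    """Count results by status and group failed test names by feature area.

    Each result is assumed to be a dict with 'test' and 'status' keys and an
    optional 'area' key; this layout is illustrative, not a real harness format.
    """
    status_counts = Counter(r["status"] for r in results)
    failures_by_area = {}
    for r in results:
        if r["status"] == "fail":
            area = r.get("area", "unassigned")
            failures_by_area.setdefault(area, []).append(r["test"])
    return status_counts, failures_by_area


if __name__ == "__main__":
    # Made-up sample data for illustration only.
    sample = [
        {"test": "boot_to_menu", "status": "pass", "area": "frontend"},
        {"test": "join_lobby", "status": "fail", "area": "network"},
        {"test": "save_game", "status": "fail"},
    ]
    counts, failures = triage_automation_results(sample)
    print(dict(counts))
    for area, tests in failures.items():
        print(f"{area}: {', '.join(tests)}")
```

Failures with no area land in an "unassigned" bucket, which gives the Senior Tester an obvious list of items to route by hand.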

At this point in the game, there’s only a few more weeks/months until product release. As a PM or DevManager, hold a confidential meeting with the project’s Lead SDET and review the performance of ALL of your STE’s. If somebody is not making the grade, ask the Lead SDET to correct any substandard performance immediately… or replace the underperformer with a veteran Test Engineer from another team who won’t require any “ramp up” time. While there can be, and will be, test engineering mistakes made early on in every single project (hey, we’re all human), now is not the time for mistakes, and as we approach the home stretch there is zero room available for critical errors that could potentially delay or embarrass the product launch- unless you want to get slaughtered on Metacritic.

Around now it is also not uncommon for one specific team issue to arise- there may be disagreements between “The Devs” and the QA. SDE’s under a backlog or in “bug jail” may become a bit testy, and increasingly sensitive & defensive towards testers and SDET’s who are simply doing their job by continuing to find bugs. While it’s natural to raise the bug bar a bit towards the end of the SDLC, the veteran DevManager should generally be aware of this scenario already, and the smarter ones will generally side with, and advocate for, QA when possible. After all, would you rather your studio be known for releasing “half-baked potatoes”, or releasing polished products? Two major studios (that I will not name) recently released grossly unpolished & megabuggy games that almost singlehandedly destroyed their entire studio’s reputation- and it won’t come as a shock to any studio execs or shareholders when consumers are MUCH less eager to purchase the sequel to their (anonymous) franchises.

Finally, now is an excellent time to assign a small, temporary White Hat Hacking Strike Team composed of one Senior Test Associate to try and crack the front end of your game or app, and one STE-2 or Network Engineer to attempt to penetrate the back end. This is an especially critical test task for preventing criminal exposure and fraud if your program features any kind of subscription service, company store, or ingame monetary microtransactions whatsoever. Generally, the code infrastructure and any network features you may have intended are simply not functional enough to organically conduct advanced penetration testing any earlier in the SDLC than this point right here- which is why White Hat Hacking test coverage is ideally executed at this time.

Don’t wait until “crunch time” to conduct anti-malicious testing though- by that point in the project things are usually so frantic and backlogged that it may be too late to close any critical coding loopholes before project release. If you wait until Crunch Time to discover a potentially critical malicious exploit in your code, the SDE’s and Network Engineers may be forced into an undesirable situation where they have to decide between debugging said security risk, OR cutting and/or delaying attractive features that aren’t complete yet. Don’t allow your team to get placed into this situation. You only get one chance to create a first impression, and holding off popular consumer features until months after launch because you put off comprehensive IT defense testing until it placed you into a jam… is an avoidable disaster that could easily have been prevented by better foresight earlier on, and the adoption of more progressive “upstream testing” methodologies.

“CRUNCH TIME” AND THE FINAL HOME STRETCH OF THE SDLC

All BVT Reporting should be fine-tuned by now like a fine German automobile. It should be a single, consolidated, all-inclusive report that includes the trifecta of: Automation BVT results, Build/OS BVT results, and Tools BVT results… all on one big fat BVT or IVT report. At this point, this smoketest assignment should best go to your strongest, most senior tester on the entire Test Associate squad- your Top Gun (you may even need two if you are running multibranch integration snaps, in the case of a MASSIVE project such as developing a new console or MMORPG). Note: This should NOT be the Lead Test Associate. They need to focus on supervising the test associates, not get bogged down smoketesting all day long at the expense of their primary job, which is TA supervision.
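
As a rough illustration of what a consolidated report boils down to, here is a sketch that merges the three result sets into one summary with a single Green Light or Red Light. The function name, the check names, and the rule that any single failure turns the light red are assumptions of mine. Your team may well tolerate known, non-blocking failures.

```python
def build_bvt_report(build_id, automation, build_os, tools):
    """Merge three {check_name: passed} dicts into one verdict and report text."""
    sections = {
        "Automation BVT": automation,
        "Build/OS BVT": build_os,
        "Tools BVT": tools,
    }
    # Assumed policy: any failed check in any section blocks the build.
    blockers = [
        f"{name}: {check}"
        for name, checks in sections.items()
        for check, passed in checks.items()
        if not passed
    ]
    light = "RED" if blockers else "GREEN"

    lines = [f"BVT report for build {build_id}: {light} LIGHT"]
    for name, checks in sections.items():
        passed_count = sum(1 for ok in checks.values() if ok)
        lines.append(f"  {name}: {passed_count}/{len(checks)} passed")
    lines.extend(f"  BLOCKER -> {blocker}" for blocker in blockers)
    return light, "\n".join(lines)


if __name__ == "__main__":
    # Made-up check names and results for illustration only.
    light, text = build_bvt_report(
        "1234",
        automation={"harness_run": True},
        build_os={"boots": True, "install": False},
        tools={"bug_tracker": True},
    )
    print(text)
```

Putting the verdict on the first line means everybody rolling in at 9AM can read the light straight from their inbox without opening the details.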

The official smoketester, AKA “Top Gun” is generally now going to be in a world of hurt. It’s probably one of the single most difficult jobs in the entire department. They will probably have to wake up at 4AM or 5AM in the morning to complete their smoketesting so that when every other dev and tester casually rolls into the office at the crack of 9AM, everybody’s got a fully comprehensive BVT or IVT waiting in their email inbox with the Green Light or Red Light needed to decide on whether to keep working on that new build all day (or stick with the last known build to have passed BVT). This guy is also entrusted to produce accurate & uncannily reliable results, otherwise development can be halted for an entire day if he makes a single mistake or typo- and everybody in the entire department will know about it. This is unlike SDE’s who screw up and get bug reports sent back daily with no repercussions, and SDET’s who break test harnesses weekly… without reprisal. Additionally, the smoketester is the only Test Associate required to be in regular contact with Senior Dev’s, Sr. STE’s, and the PM’s… and everything he does is incredibly time sensitive to boot.

Finally, it’s not uncommon for coworkers who’ve never had to smoketest a multimillion or multibillion dollar project before to storm into the tester’s office and demand the BVT faster which only disrupts the work and slows down the greenlight for everyone- so any Lead SDET or QA Manager should set official policy to ban that kind of interruptive behavior. Also, it’s bad form for any SDE to ask the smoketester to BVT any “private build” as a personal favor- especially if there’s only a single overwhelmed smoker, rather than a team. The software development engineer can and should test out his personal experiments himself. In conclusion, if you are developing a new console or MMORPG, pick your smoketester(s) well, and don’t forget to thank them for their hard work.

Just before product launch, your STE-1’s should generally be working with the Lead TA to certify compliance, and practicing intermediate level skills so they can make STE-2 on the next project. STE-2’s should be maintaining the automation harness vigilantly and feverishly testing matchmaking automation whilst personally overseeing a special task force of Multiplayer Test Associates. Generally by this late stage in the game right before launch, your STE-3/Sr. SDET’s may finally have some technical debt to catch up on and upgrade, but most importantly they can now expect to occasionally get called in to quickly troubleshoot traditional lower & middle tier issues that junior and intermediate test engineers would normally handle. That’s because if you only have 10 days before the Release Candidate or Compliance Certification milestone, and Junior Jimmy broke the test harness or can’t figure out what’s wrong with Team Foundation Server today… you need your QA Department’s very best programmer to fix that thing within the hour, not let someone dillydally over it for half a day or more.

POST RELEASE

You’ll likely need a Day 1 patch, but barring unforeseen catastrophe, these patch “bandaids” will traditionally fall to the standard SDE’s and the Network/Server Engineers (Error: 37, anyone?)- NOT the STE’s. However, it won’t hurt to have an STE-1 plus the Lead TA oversee this single task just to ensure that the standard Test Associates have properly tested the Day 1 Patch, so the STE-1 can “certify” that the Day 1 Patch has purified the build for release day. This is your last chance to catch any bugs before they go into the wild, so this phase is absolutely critical. There’s nothing worse than spending a year of your life proudly devoted to your work… only to find out about a debilitating bug, gamebreaking exploit, or customer data theft for the first time ever on CNN or Kotaku with the eyes of the entire world on your SNAFU.