If I’m a Hiring Manager, How Do I Assess AI Skills Beyond Tool Usage?

Posted on 2026-06-23 14:54:25

Every hiring manager from North Sydney to the Melbourne CBD is currently grappling with the same headache. You have a candidate who claims they are an "AI expert" because they’ve built a personal dashboard using an AI assistant to summarise their grocery list. But is that expertise, or is that just familiarity?

In the Australian market, the confusion is palpable. According to recent workforce insights from the Tech Council of Australia, we are facing a critical skills gap. Everyone wants AI skills, but nobody can agree on what those skills actually are. If you’re hiring, it is time to stop looking for the person who has the most prompts saved in their notes app and start looking for the person who understands the architecture of the beast.

Defining the Terms: Familiarity vs. Expertise

Let’s be clear before we go any further. If a candidate calls themselves an "AI Engineer" because they spend their day tweaking prompts for a large language model (LLM), they are not an engineer; they are a power user.

Here is how I define the divide:

AI Familiarity: The ability to use off-the-shelf tools, write effective prompts, and understand the basic "chat" interface of modern models. This is entry-level. It’s useful, but it doesn't move the needle on enterprise-grade infrastructure.

AI Expertise: The ability to understand the lifecycle of a model, from data governance and bias mitigation to the cost implications of token usage and the architectural constraints of the underlying LLM. This is what you actually need for your team.

If you don't make this distinction early in the screening process, you are going to end up with a team that can generate nice-looking emails but can’t tell you why your RAG (Retrieval-Augmented Generation) pipeline is hallucinating in production.

The Mid-Career Pivot

The most promising talent pool isn't coming out of fresh undergraduate degrees. It’s coming from the 5-to-15-year experience bracket. These are your existing business analysts, systems engineers, and product leads who are rapidly upskilling.

We are seeing a major shift where online postgraduate study has finally shed its "second-rate" reputation. Institutions like The University of Melbourne have moved to offer high-rigour, online-first certifications that are functionally equivalent to their campus-based programs. When I interview candidates from these streams, they aren't just playing with tools; they are learning the math, the ethics, and the systemic risks.

When you see a CV with a postgraduate credential from a reputable Australian university, it usually signals that the candidate has moved past the "magic trick" phase of AI and into the "systems design" phase.

The Hiring Gap: Assessing Beyond the Surface

So, how do you actually vet them? You need to move away from generic questions and toward structured, rigorous assessment. Here is a breakdown of how to map the difference between tool users and genuine experts.

| Metric | The AI "User" | The AI "Expert" |
| --- | --- | --- |
| Handling Hallucinations | "I try to rephrase the prompt until it gets it right." | "I implement chain-of-thought verification, external ground-truth validation, and constrained decoding." |
| Tool Selection | "I use ChatGPT for everything." | "I evaluate models based on context window requirements, latency constraints, and privacy-first deployments." |
| Data Privacy | "I just make sure I don't paste sensitive stuff." | "I design workflows that isolate PII (Personally Identifiable Information) before the data hits the model's ingress point." |
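To make the "expert" answer on data privacy concrete, here is a minimal sketch of isolating PII before text reaches a model. It is an illustration under assumptions, not a production approach: the function name `redact_pii`, the placeholder format, and the two regex patterns (email and a simplified Australian mobile number) are all hypothetical, and real PII detection needs far broader coverage than this.

```python
import re

# Illustrative patterns only (assumptions): a basic email matcher and a
# simplified Australian mobile number. Real PII coverage is much wider.
PII_PATTERNS = {
    "EMAIL": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    "PHONE": re.compile(r"(?:\+61|0)4\d{2} ?\d{3} ?\d{3}"),
}


def redact_pii(text):
    """Swap PII matches for placeholders; return the text and the mapping.

    The mapping stays on your side of the boundary so the original values
    can be restored after the model responds.
    """
    mapping = {}
    for label, pattern in PII_PATTERNS.items():
        def _substitute(match, label=label):
            token = f"<{label}_{len(mapping) + 1}>"
            mapping[token] = match.group(0)
            return token

        text = pattern.sub(_substitute, text)
    return text, mapping


safe_text, pii_map = redact_pii("Contact jane@example.com or 0412 345 678 today")
print(safe_text)  # Contact <EMAIL_1> or <PHONE_2> today
```

A strong candidate will point out what a sketch like this misses (names, addresses, free-text identifiers) and where the mapping should be stored.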

The System Design Interview: The Litmus Test

Stop asking candidates to "write a prompt." Instead, throw them a system design interview scenario that forces them to grapple with real-world constraints.

Try this: "We need to build an AI-powered assistant for our customer support team to pull data from our internal, legacy database. The model needs to be accurate 99% of the time, and we cannot send our customer data to a public cloud API. How do you design this?"

A candidate with only "familiarity" will talk about uploading PDFs to a chatbot. A candidate with "expertise" will start talking about vector databases, embedding models, local inference options (like running a quantized Llama 3 model on a private instance), and the need for a human-in-the-loop (HITL) verification step that feeds corrections back into the system.
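If you want a reference point for what the retrieval half of that answer looks like, the sketch below ranks documents against a query by cosine similarity. Treat it as a toy: the bag-of-words `embed` function is a stand-in for a real embedding model, the in-memory list is a stand-in for a vector database, and the function names and sample documents are invented for illustration.

```python
import math
from collections import Counter


def embed(text):
    """Toy bag-of-words vector; an assumption standing in for an embedding model."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[term] * b[term] for term in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query, documents, k=2):
    """Return the k documents most similar to the query."""
    query_vec = embed(query)
    ranked = sorted(documents, key=lambda d: cosine(query_vec, embed(d)), reverse=True)
    return ranked[:k]


# Hypothetical knowledge-base snippets.
DOCS = [
    "refund policy for damaged goods",
    "how to reset a customer password",
    "shipping times for regional australia",
]

print(retrieve("reset password", DOCS, k=1))
```

In the interview, the interesting part is what the candidate says sits around this step: how documents are chunked, where the index lives, and how retrieved context is checked before it reaches a locally hosted model.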

Model Evaluation Tasks: Beyond the "Vibe Check"

We’ve all seen the "vibe check"—where a developer asks an LLM to generate code and says "looks good enough." That is a fast track to technical debt. During your hiring process, incorporate model evaluation tasks that are objective.

Give them a set of outputs and a set of constraints. Ask them to build a rubric to score the accuracy. If they can’t explain how to quantify the difference between a "good" answer and an "acceptable" answer using something more robust than "it sounds smart," they don't have the depth you need.
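As a rough picture of what a quantified rubric can look like, here is a small sketch that scores an output against weighted pass/fail criteria and maps the score to a grade. Everything specific in it is an assumption made for illustration: the criteria, the weights, the `POL-` policy ID convention, and the 0.8 and 0.5 thresholds.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Criterion:
    name: str
    weight: int                   # relative importance (hypothetical values)
    check: Callable[[str], bool]  # True when the output satisfies the criterion


# Hypothetical rubric for a customer-support answer.
RUBRIC = [
    Criterion("cites_policy_id", 5, lambda out: "POL-" in out),
    Criterion("under_50_words", 2, lambda out: len(out.split()) <= 50),
    Criterion("no_speculation", 3, lambda out: "probably" not in out.lower()),
]


def score(output, rubric):
    """Weighted fraction of criteria the output passes, between 0 and 1."""
    total = sum(c.weight for c in rubric)
    earned = sum(c.weight for c in rubric if c.check(output))
    return earned / total


def grade(value):
    """Map a score to a label; the thresholds are illustrative assumptions."""
    if value >= 0.8:
        return "good"
    if value >= 0.5:
        return "acceptable"
    return "reject"


answer = "Refunds are probably covered under POL-12."
print(grade(score(answer, RUBRIC)))  # acceptable
```

The point is not these particular checks. It is whether the candidate can state criteria that are testable, weight them deliberately, and defend where the line between "good" and "acceptable" sits.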

As PwC consultants have noted in recent industry roundtables, the biggest risk in Australian enterprises right now isn't that AI isn't working—it's that it's working inconsistently. Your candidates need to understand how to build guardrails that prevent that inconsistency.
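One simple way to make "inconsistency" measurable is to re-run a fixed, labelled evaluation set on a schedule and compare accuracy against a baseline. The sketch below assumes you already have pass/fail results for each run; the function names and the 5-point tolerance are hypothetical choices, not a standard.

```python
def accuracy(results):
    """Share of passing results in a list of booleans."""
    return sum(results) / len(results) if results else 0.0


def has_drifted(baseline, recent, tolerance=0.05):
    """Flag a drop in accuracy larger than the tolerance (an assumed threshold)."""
    return accuracy(baseline) - accuracy(recent) > tolerance


# Hypothetical results from the same evaluation set, run a month apart.
baseline_run = [True] * 9 + [False]      # 90% correct
recent_run = [True] * 7 + [False] * 3    # 70% correct

print(has_drifted(baseline_run, recent_run))  # True
```

A candidate with real depth will push past this: which examples regressed, whether the input data changed, and whether the evaluation set still reflects what users actually ask.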

Three Questions to Ask in Every Interview

If you have twenty minutes, spend them on these three questions. They will expose the "prompt engineer" vs. the "AI systems thinker" immediately:

"Tell me about a time an AI model failed to provide the correct answer in a production context. How did you diagnose the root cause?" (You’re looking for evidence of debugging, not just blaming the model.) "How do you evaluate the 'drift' of a model’s performance over time?" (You’re looking for someone who understands that models aren't static—they evolve as data changes.) "When should we NOT use an LLM for a business problem?" (If they answer "Never," end the interview. The best experts know that sometimes a regex or a simple SQL query is better than an expensive, latency-heavy LLM.)

The Bottom Line: Don't Buy the Hype

It is easy to get swept up in the narrative that AI will change everything tomorrow. It won't. AI is a tool, like a database, a cloud platform, or a microservices architecture. It requires the same rigour in hiring that you would apply to a Senior Backend Engineer or a Database Administrator.

Avoid the candidates who talk in Silicon Valley buzzwords. Look for the ones who talk about latency, token economics, data lineage, and security posture. Those are the people who will actually help your business scale, rather than just costing you a monthly enterprise subscription fee.
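"Token economics" can sound like a buzzword too, so it helps to know what the basic arithmetic looks like. This is a back-of-the-envelope sketch only: the function name, the traffic figures, and the per-million-token prices are made-up assumptions, not any vendor's actual pricing.

```python
def monthly_cost(requests_per_day, input_tokens, output_tokens,
                 price_in_per_million, price_out_per_million, days=30):
    """Estimate monthly spend from average tokens per request and unit prices."""
    per_request = (
        input_tokens * price_in_per_million
        + output_tokens * price_out_per_million
    ) / 1_000_000
    return requests_per_day * days * per_request


# Hypothetical workload: 10,000 requests a day, 1,500 tokens in, 300 tokens out,
# priced at 3.00 and 15.00 per million tokens respectively.
estimate = monthly_cost(10_000, 1_500, 300, 3.0, 15.0)
print(round(estimate, 2))  # about 2700, in whatever currency the prices use
```

A candidate who can run this kind of estimate unprompted, and then explain how caching, shorter context, or a smaller model would change it, is talking about the right things.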

The skills gap is real, but it’s solvable. Start by setting a higher bar in your own recruitment cycle. You aren't hiring a "prompt writer"; you're hiring someone to help you navigate a fundamental shift in how we handle information. Vet them accordingly.