Inde Technology Jul 02, 2025

Taming the AI Beast


Unless you’ve been living under a rock for the past several years, you’ll be very aware of the profound impact that Artificial Intelligence (AI) has had across all industries – and for good reason. Organisations have recognised the immense efficiencies that business processes can gain by leveraging AI and Large Language Model (LLM) technologies, and many have eagerly seized opportunities to adopt them. Individuals from across the technical spectrum continue to find new ways to augment their capabilities with AI, achieving levels of productivity that were previously out of reach.

But as with all good things in life, moderation is key. The feverish hype surrounding AI now risks boiling over, as we increasingly see overuse without due regard for its technical limitations. Cynics among us will say it has already crossed that line, with some businesses now scaling back their AI efforts for a variety of concerns: the rampant spread of snake oil AI software, eye-wateringly large API bills, and serious worries about the privacy and security implications of consuming AI services. Others are recognising that human contact forms a critical part of what defines their customer experience, and that removing people from key processes may eliminate a vital point of difference. 

In this first post of our AI and Security series, we'll examine both the genuine capabilities and fundamental limitations of current AI systems, and how understanding both is crucial for managing the risks we face today. 

The Limits of AI 

While the performance of popular LLMs has unquestionably improved over time, hallucinations, bias, and factually incorrect responses remain common problems. Why is this?

LLMs have a strong understanding of language patterns and statistics, but they lack the essential capabilities required for true reasoning: 

  • Creativity: They're limited to recombination of existing knowledge rather than genuine creative synthesis. 
  • Truthfulness: They have no measure of truth, lacking an understanding of causality beyond correlation. 
  • Inference: Their ability to infer user context beyond what is explicitly provided in a prompt remains limited. 
  • Grounding: They have no situational awareness, values, emotions, or real-world grounding beyond what exists in their training data. 

These capabilities form vital components of the reasoning process that humans use to make decisions. Our decision-making isn't based on knowledge alone – it includes contextual understanding, emotional intelligence, and real-world experience. Humans operate within a dynamic, multi-modal societal context that changes constantly, while LLMs remain constrained to the static context embedded in their training data and whatever external integrations their providers have made available. 

But what about the reasoning models developed by OpenAI, DeepSeek, and Anthropic? Well, they employ chain-of-thought techniques, such as: 

  • Query reformulation and decomposition: Breaking complex problems into smaller, more manageable parts (a rough application-level sketch follows this list).
  • Cross-domain pattern recognition: Identifying relevant patterns across different areas of knowledge.   
  • Context integration from multiple sources: Synthesising information from diverse contexts. 
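
To make the decomposition idea concrete, the same pattern can be approximated at the application layer: ask the model to split a complex question into sub-questions, answer each one, then synthesise the results. The Python sketch below is purely illustrative; the `call_llm()` helper is a hypothetical placeholder rather than any specific vendor's API.

```python
# Minimal sketch of query decomposition at the application layer.
# call_llm() is a hypothetical placeholder for a real client call to an
# approved LLM provider.

def call_llm(prompt: str) -> str:
    """Send a prompt to an LLM endpoint and return its text reply."""
    raise NotImplementedError("wire this up to your approved provider")

def answer_with_decomposition(question: str) -> str:
    # 1. Ask the model to break the problem into smaller sub-questions.
    plan = call_llm(
        "Break the following question into three to five smaller "
        f"sub-questions, one per line:\n{question}"
    )
    sub_questions = [line.strip() for line in plan.splitlines() if line.strip()]

    # 2. Answer each sub-question independently.
    partial_answers = [call_llm(f"Answer concisely: {sq}") for sq in sub_questions]

    # 3. Synthesise the partial answers into a final response.
    notes = "\n".join(
        f"Q: {sq}\nA: {ans}" for sq, ans in zip(sub_questions, partial_answers)
    )
    return call_llm(
        "Using only the notes below, answer the original question.\n"
        f"Original question: {question}\nNotes:\n{notes}"
    )
```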

These models generate detailed intermediate reasoning steps before producing final answers, consuming more computational resources to work through problems step by step. While this appears to address some of the fundamental limitations, recent Apple research has also found significant problems with this approach. In 'The Illusion of Thinking: Understanding the Strengths and Limitations of Reasoning Models via the Lens of Problem Complexity,' Apple researchers tested reasoning models using logic puzzles like Tower of Hanoi and found three distinct performance regimes: 

  • Simple tasks: Reasoning models underperformed standard LLMs. 
  • Medium-complexity problems: They showed clear advantages. 
  • Complex novel variants: Both types of models suffered complete performance collapse. 

As problems grew more complex, reasoning models initially increased their reasoning effort but then dramatically reduced it when approaching the threshold of their capability. Even when researchers supplied the exact algorithms required to solve problems, the models still failed on the more challenging variants. 
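
For context, the kind of algorithm referred to here is short and entirely mechanical; a standard recursive Tower of Hanoi solution is sketched below in Python. The study's point is that even with a procedure like this in hand, the models' accuracy still collapsed once the puzzle instances grew large enough.

```python
# Standard recursive Tower of Hanoi: move n disks from source to target
# using a spare peg. The move count grows as 2**n - 1, so the procedure
# is trivial to state but the output quickly becomes very long.

def hanoi(n: int, source: str = "A", target: str = "C", spare: str = "B") -> list[tuple[str, str]]:
    if n == 0:
        return []
    moves = hanoi(n - 1, source, spare, target)   # move n-1 disks out of the way
    moves.append((source, target))                # move the largest disk
    moves += hanoi(n - 1, spare, target, source)  # move n-1 disks on top of it
    return moves

print(len(hanoi(10)))  # 1023 moves, i.e. 2**10 - 1
```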

While the models produce seemingly elaborate chains of thought, they aren't engaging in genuine logical reasoning. Instead, they perform sophisticated pattern matching, accompanied by verbose explanations that obscure their fundamentally statistical nature. As Thomas Dietterich noted in his presentation, 'What’s Wrong with Large Language Models, and What We Should Be Building Instead': 

'They are actually not knowledge bases, but they are statistical models of knowledge bases.'

What This Means for You 

With an understanding of AI’s limitations, it becomes easier to see why we must be exceptionally cautious in how we use it: 

  • Creative productions: "AI Slop" has developed a recognisable signature: a lot of words with little substance, subtle (or sometimes blatant) technical inaccuracies, and audiovisual works that lack creativity or soul. For organisations or individuals seeking to position themselves as experts, this carries serious reputational risks and can undermine credibility. The more human expertise provided up front in an assistant conversation, the better the quality of the output.
  • Data analysis: LLMs provide limited explanation of their analytical process, creating a potentially troubling "black box" effect in contexts that require transparency and accountability. Coupled with their propensity to hallucinate, this raises the risk of incorrect or unjustified decisions, with no clear way to trace or understand how those conclusions were reached. 
  • Software development: LLMs rely on statistical pattern recognition rather than true code comprehension. Beyond the risk of leaking intellectual property, AI-led development introduces several practical concerns. Models trained on older data may generate code that depends on outdated or vulnerable libraries, while newer frameworks may be handled through little more than guesswork. Context window limitations prevent models from considering the broader architecture of large codebases, potentially introducing critical flaws such as broken authentication or authorisation. Most critically, AI-generated code can appear perfectly valid while embedding subtle security vulnerabilities, often replicating unsafe patterns found in training data (see the sketch after this list).
  • Extended interactions: Long conversations noticeably degrade as accumulated chat context pollutes future outputs, causing responses to drift from the original topic and become increasingly incoherent. For tasks that depend on multi-turn exchanges – such as iterative drafting, troubleshooting, software development or detailed analysis – this undermines the reliability of results and increases the risk of flawed decisions. Breaking a larger conversation into multiple, more manageable components helps to sharpen the context and reduce the risk of responses becoming unwieldy.
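
To illustrate the kind of subtle flaw flagged under software development above, the Python sketch below contrasts a plausible-looking but injectable SQL query, a pattern still common in older training data, with the parameterised version a reviewer should insist on. The table and column names are hypothetical.

```python
import sqlite3

def find_user_unsafe(conn: sqlite3.Connection, username: str):
    # Looks reasonable and works in testing, but concatenating user input
    # into SQL allows injection (e.g. username = "x' OR '1'='1").
    query = f"SELECT id, email FROM users WHERE username = '{username}'"
    return conn.execute(query).fetchall()

def find_user_safe(conn: sqlite3.Connection, username: str):
    # Parameterised query: the driver treats the input as data, not SQL.
    query = "SELECT id, email FROM users WHERE username = ?"
    return conn.execute(query, (username,)).fetchall()
```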

A common factor throughout these concerns is the danger of overreliance on AI systems. Without careful management, this dependency may lead to the atrophy of fundamental human cognitive abilities, as our mental faculties require regular exercise to remain sharp. While the temptation to outsource thinking tasks may promise time and effort savings, excessive dependence impairs critical thinking skills, erodes grammatical and practical abilities, and reduces capacity for independent operation. The effect is not unlike how autocorrect and grammar assistants have contributed to declining language proficiency, or how reliance on GPS has dulled our spatial awareness and memory. Notably, Harvard Business School research suggests that both expert and novice users are equally susceptible to being persuaded by plausible-sounding (but inaccurate) AI responses, contributing to diminished analytical rigour and potentially fuelling the spread of misinformation.

How to Address This 

There are several key actions that can be taken to mitigate these risks:

Educate Your Staff 

AI has become a common tool that most technical professionals will use to some extent, yet broad understanding of how it works and where its limits lie remains rare. This gap is why we see the technology misapplied to use cases beyond its capabilities, producing output that fails to accurately reflect the personality, values and experience of the operator. It is essential that staff approach assistant output with healthy scepticism and that teams require peer review of any content produced by AI.

Meet the Needs of Your Staff 

Building on this, many professionals now expect to have access to an AI assistant – such as ChatGPT or Claude – just as they’d expect to use effective note-taking software. If these expectations are not met, shadow IT is almost certain to emerge, as staff turn to unsanctioned tools they see as necessary to do their jobs. These may include free-tier products that collect user data, or tools developed by adversarial nations, potentially placing your data at serious risk of compromise. It is therefore in your best interests to provide approved AI tooling that satisfies both your security requirements and the needs of your staff.

Formulate a Robust AI Usage Policy 

An AI Usage Policy serves as the foundation for setting the guardrails that protect your users and their data against the risks associated with AI platforms. It provides clear decision-making criteria for teams evaluating new AI tools and use cases, ensuring a consistent approach to risk assessment across the organisation. The policy also enables accountability by defining clear responsibilities for AI governance and establishing measurable compliance standards. Your policy should clearly outline: 

  • Approved AI Technology: The platforms and integrations approved for use within the organisation, and examples of technology that are not authorised. 
  • AI Usage Process: How company data, customer data and other non-anonymised data may be used with AI platforms.
  • AI Education: The frequency and methodology required to educate staff on safe, fair and ethical usage of AI. 
  • AI Ethical Use: The framework required to assess the ethical implications, risks and appropriateness of an AI solution. 
  • AI Software Development: The conditions under which AI may be used to aid software development, including the supply of proprietary source code, database access, and the requirement for peer review by a senior developer.
  • AI Licensing and Supplier Selection: The channels through which AI technology must be licensed, and the standards that suppliers must meet.
  • AI Technology Risk Management: The process through which risk assessments are carried out prior to procurement or the adoption of new use cases.
  • Policy Compliance: How the policy is measured and enforced, and how changes and exceptions can be made. 

This may be accompanied by a matrix that outlines what technologies are permitted for use and in what circumstances, for example: 

Business Area | AI Allowed (Y/N) | Approved Technologies | Approved By | Usage Examples
Software Development | Y | Claude | CTO | Production of code and scripts; creation of product documentation; generation of tests and dummy data
HR | N | N/A | Director HR | N/A
Social Media | Y | ChatGPT, Hubspot | Director Marketing and Sales | Creation of social media posts
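
To make a matrix like this easier to apply consistently, it can also be encoded in a machine-readable form so that tooling or a simple script can answer "is this technology approved for this business area?". The Python sketch below is illustrative only and simply mirrors the example rows above.

```python
# Illustrative encoding of the example approval matrix above, so approvals
# can be checked programmatically (e.g. from an onboarding or audit script).

APPROVAL_MATRIX = {
    "Software Development": {
        "allowed": True,
        "technologies": {"Claude"},
        "approved_by": "CTO",
    },
    "HR": {
        "allowed": False,
        "technologies": set(),
        "approved_by": "Director HR",
    },
    "Social Media": {
        "allowed": True,
        "technologies": {"ChatGPT", "Hubspot"},
        "approved_by": "Director Marketing and Sales",
    },
}

def is_approved(business_area: str, technology: str) -> bool:
    entry = APPROVAL_MATRIX.get(business_area)
    return bool(entry and entry["allowed"] and technology in entry["technologies"])

print(is_approved("Software Development", "Claude"))  # True
print(is_approved("HR", "ChatGPT"))                   # False
```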

Exploit the Strengths of LLMs 

Given these limitations, we're not advocating against AI use but for strategic deployment where it genuinely excels whilst maintaining rigorous boundaries.  

LLMs excel at language transformation – converting technical documentation into user-friendly guides, extracting data from structured formats, restructuring reports for different audiences, or translating complex concepts into accessible explanations. These tasks leverage their statistical understanding of language patterns without requiring the logical reasoning capabilities they fundamentally lack. 

Pattern recognition across large datasets represents their strongest practical application. LLMs can identify recurring themes in customer feedback, spot inconsistencies in documentation, or highlight anomalies in text-based data at a scale that far exceeds human capacity for processing unstructured text.
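
As a minimal sketch of that kind of use, the snippet below batches free-text feedback and asks a model to surface recurring themes. As in the earlier example, `call_llm()` is a hypothetical placeholder for a call to an approved provider, and the output still needs human review.

```python
# Minimal theme-extraction sketch. call_llm() is a hypothetical placeholder
# for a call to an approved LLM provider.

def call_llm(prompt: str) -> str:
    """Send a prompt to an LLM endpoint and return its text reply."""
    raise NotImplementedError("wire this up to your approved provider")

def extract_themes(feedback_items: list[str], batch_size: int = 50) -> list[str]:
    themes: list[str] = []
    for start in range(0, len(feedback_items), batch_size):
        batch = feedback_items[start:start + batch_size]
        prompt = (
            "List the recurring themes in the customer feedback below, "
            "one theme per line, most frequent first:\n\n"
            + "\n".join(f"- {item}" for item in batch)
        )
        for line in call_llm(prompt).splitlines():
            if line.strip():
                themes.append(line.strip("- ").strip())
    # Human review remains essential: the model may invent or over-generalise themes.
    return themes
```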

Code scaffolding proves valuable when approached correctly. Producing boilerplate code, generating test cases, or creating documentation templates aligns well with their pattern recognition abilities, though careful human oversight remains essential to catch the subtle flaws they inevitably introduce. 

The key lies in matching AI deployment to tasks that align with its statistical nature. Use it for pattern recognition but require human judgement for decisions. Deploy it for content generation but maintain human authority over final approval. Leverage it for research assistance whilst ensuring authoritative analysis remains firmly in human hands. Start by understanding what truly differentiates your business, then deploy AI to support those strengths through operational improvements rather than replacing the unique human elements that set you apart. 

Leverage our Expertise 

Inde’s approach to AI has always been about doing the work properly. We focus on delivering real outcomes through a structured and thoughtful process. Our AI Engagement Framework supports organisations at every stage of their AI journey. It includes workshops, policy development, business process improvements, proof-of-concept pilots, rollouts and long-term governance. The goal is to help identify where AI can add genuine value while ensuring responsible and secure deployment. 

Achieving the AI Platform on Microsoft Azure specialization reinforces this commitment. As the first New Zealand-owned partner to earn this recognition, Inde has demonstrated both technical capability and a strong track record of delivering scalable, secure AI solutions.  

If you're considering how AI fits into your organisation, please get in touch.  

 


About the author

Inde Technology

Inde Technology is a New Zealand employee-owned and operated, cloud-first provider of enterprise technology solutions with offices across the country. As a specialist solution provider, we focus on providing leading solutions to our customers based on best-of-breed products delivered by our highly skilled team. We enable our customers to quickly solve challenges, gain insight, and achieve end-user outcomes.
