
From Chatbots to Agents: Applying Reasoning to AI

  • Writer: Chris Morris
  • Oct 7
  • 7 min read

Updated: Oct 8

Building Human Logic into AI Agents: Decisions as Supercharged Simulation




One thing that separates run-of-the-mill AI tools from AI agents is reasoning: making logical decisions from the inputs they see. Mastering this is the next frontier, where we move beyond question–response into true automation.


Let’s be frank: many people don’t believe AI can “reason” and pour scorn on any claims that it can. This is because most AI solutions today are not agentic; they’re chatbots. A chatbot is not an agent. Selling a chatbot as an agent is like claiming that having a point-and-speak phrasebook makes you fluent in a language. That’s parroting, not reasoning.


We should also be clear: an AI agent cannot reason exactly like a human. The human brain is hard-wired for intuitive shortcuts and fast heuristics (that’s half the reason advertising works); we can’t expect an agent to do that, because it isn’t wired like us.


If we accept that agents don’t take human shortcuts, they need all the facts. That’s where machine learning (ML) shines: it can simulate outcomes at lightning pace and process millions of datapoints. But ML alone doesn’t give us context or logic.


Couple ML with an AI reasoning layer and you’re cooking with gas. ML assembles and predicts from the facts; AI is trained to filter options, weigh the pros and cons, choose the next best action based on what the agent observes and then verify those decisions against reality.


This is how AI agents reason: not through intuition, but through facts delivered by supercharged simulation. At Tack Tech, we build agents that make logical decisions autonomously using this approach. We’ve handed our agents real media decisioning because we’ve encoded a step-by-step decision process that mirrors how human experts think, rather than hoping for the best.



The AI agent logic process


If you’re building agents, begin with this playbook (a minimal code sketch follows the list):

  1. Gather facts that inform the decision

  2. Weigh options for next actions

  3. Forecast consequences for each option

  4. Check reality and adjust based on outcomes
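
As a minimal sketch of that loop: the function arguments here are hypothetical stand-ins for the sub-agents, simulators, and delivery APIs described in the sections below, and the assumption is that each forecast returns a comparable "score".

def agent_decision_loop(brief, gather_facts, propose_options, forecast, observe_live, adjust):
    facts = gather_facts(brief)                                # 1. gather facts that inform the decision
    options = propose_options(facts)                           # 2. weigh options for next actions
    scored = [(opt, forecast(opt, facts)) for opt in options]  # 3. forecast consequences for each option
    chosen, prediction = max(scored, key=lambda p: p[1]["score"])
    observed = observe_live(chosen)                            # 4. check reality...
    return adjust(chosen, prediction, observed)                #    ...and adjust based on outcomes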


Rather than keep this theoretical, here’s how we’ve done it in Abe. Reasoning is fundamental to media planning: it starts with known facts (budgets, minimum spends, response/reach curves). But there’s art too; facts become insights, then get blended with intuition and experience to build the optimal plan. It’s where art meets science, where logic meets reasoning.



1) Start by collecting the facts


Logical decisions begin with known facts. For an agent, that means assembling all relevant media context before proposing a plan:

  • State of the world: objectives, KPIs, constraints, budgets, preferences.

  • Evidence: documents, databases, MMM/experiments, prior decisions and outcomes.

  • Assumptions: what’s unknown, contested, or risky (tracked explicitly).


AI is excellent at gathering facts via direct tool queries and curated corpora (e.g., embeddings). We break fact-gathering into specialist sub-agents, each focused on a dataset or tool. For media planning, Abe aggregates:


  • Planning objectives & suitability: channel–objective fit mapped to funnel stage (awareness, consideration, conversion).

  • Budget sufficiency: total and phased budgets; minimum viable spends by channel; flighting windows; expected inflation.

  • ROI/response curves: industry or client-specific curves, plus incrementality where available.

  • Reach curves: channel/publisher reach–frequency curves and overlap matrices.

  • Audience consumption & shifts: current consumption patterns and trend deltas (seasonality, competitive activity, platform changes).


Each sub-agent returns a recommendation plus evidence, scoped to its domain. The central planner reconciles them.
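
One hypothetical shape for those sub-agent returns (field names are illustrative, not Abe’s actual schema):

from dataclasses import dataclass

@dataclass
class SubAgentResult:
    domain: str            # e.g. "reach_curves" or "budget_sufficiency"
    recommendation: dict   # the facts/figures this sub-agent proposes
    evidence: list         # source references backing the recommendation
    confidence: float      # how much the central planner should weight it

def reconcile(results):
    """Merge sub-agent outputs into one facts object, keyed by domain."""
    return {r.domain: {"recommendation": r.recommendation,
                       "evidence": r.evidence,
                       "confidence": r.confidence}
            for r in results}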


Return facts in structured form, not prose, so the planner can validate and query reliably:

{
  "Objective": {"stage": "Awareness", "kpi": "Effective Reach @ 3+", "target_value": 0.65},
  "Budget": {"total": 1500000, "currency": "GBP", "phasing": [{"month": "Nov", "amount": 600000}]},
  "ROI_Prior": [{"channel": "Search", "roas": 4.1}, {"channel": "OLV", "roas": 2.3}],
  "ReachCurve": [{"channel": "OLV", "a": 0.92, "b": -0.005}, {"channel": "CTV", "a": 0.88, "b": -0.004}],
  "OverlapMatrix": {"OLV": {"CTV": 0.32}, "CTV": {"OLV": 0.32}},
  "AudienceConsumption": [{"segment": "A25-44", "platform": "YouTube", "share": 0.27, "trend_delta": 0.04}],
  "Constraints": [{"type": "brand_safety", "rule": "No M-Rated Gaming"}],
  "Assumptions": [{"statement": "CTV inventory stable in Q4", "confidence": 0.7}],
  "Risks": [{"name": "CPM inflation", "likelihood": 0.6, "impact": "medium"}]
}

2) Weigh options by simulating, not guessing


Once the facts are in, the agent should behave like a seasoned planner. Humans have mental shortcuts an LLM can’t replicate. Instead of faking intuition, we exploit the machine’s edge: run huge numbers of credible plan variants, then score them transparently against the brief.


Reasoning, agent-style: simulate first, critique second, propose last. It’s the staged process the best planners follow, run at machine speed.


Shortlist first, then mix. For this objective, brand, and audience, which channels even deserve a seat? A premium awareness brief with crafted creative leans to CTV/OLV; a short-term conversion sprint looks different. If a channel can’t serve the KPI or brand tone, it’s out before we touch curves.


From that shortlist, the conversation becomes a mix, not an answer. The agent explores many plausible allocations like a planner sketching on a whiteboard while logging qualitative trade-offs (attention quality vs. learning speed, scale vs. overlap, premium context vs. volatility). The goal isn’t a single winner; it’s a frontier of strong choices where the trade-offs are explicit.


Tiny example - shortlist & generate mixes (illustrative only)

import random

# Channels that fit this objective/brand (toy logic)
OBJECTIVE_FIT = {
    "awareness": {"CTV": 0.95, "OLV": 0.90, "SocialVideo": 0.75, "Display": 0.50, "Search": 0.20}
}

def shortlist(objective, prefs=None, top_k=4):
    prefs = prefs or {}
    scored = [(ch, s + prefs.get(f"pref_{ch}", 0.0)) for ch, s in OBJECTIVE_FIT[objective].items()]
    return [ch for ch, _ in sorted(scored, key=lambda x: x[1], reverse=True)[:top_k]]

def propose_mixes(channels, n=50):
    mixes = []
    for _ in range(n):
        w = [random.random() for _ in channels]
        s = sum(w)
        shares = {c: wi/s for c, wi in zip(channels, w)}
        # sparsify tiny tails to mimic real focus
        shares = {c: (v if v >= 0.05 else 0.0) for c, v in shares.items()}
        z = sum(shares.values()) or 1.0
        mixes.append({c: v/z for c, v in shares.items()})
    return mixes

chs = shortlist("awareness", prefs={"pref_CTV": 0.05}, top_k=3)  # limit to the 3 channels with toy curves below
mix_candidates = propose_mixes(chs, n=50)

3) Forecast consequences for each option


A plan is only as good as its implications. Before anything goes live, each candidate is run through models to answer: If we run this, what will likely happen?


We estimate unduplicated reach (diminishing-returns reach curves + overlap), expected ROI/response and budget sufficiency per channel (minimum viable spends, inventory coverage, weekly pressure to hit frequency). No point estimates without context: we show confidence bands based on plausible CPM volatility and supply swings. Where judgement matters (brand appetite for premium context, tolerance for volatility), we encode it as soft preferences so the simulation reflects the real brief.


Tiny example - simulate reach/ROI and check sufficiency

import math

def reach_curve(a, b, spend, cpm):
    imps = (spend / max(cpm, 1e-6)) * 1000
    return max(0, min(1, a * (1 - math.exp(b * imps))))  # b < 0

def unduplicated(reach_by, overlap):
    ch = list(reach_by.keys()); total = sum(reach_by.values())
    for i, c1 in enumerate(ch):
        for c2 in ch[i+1:]:
            total -= overlap.get(c1, {}).get(c2, 0.0) * min(reach_by[c1], reach_by[c2])
    return max(0.0, min(1.0, total))

def roi_response(a, b, spend):  # placeholder for MMM/uplift
    return a * math.log(1 + b * spend)

def evaluate_mix(mix, budget, reach_params, roi_params, cpm, overlap, min_spend):
    spend = {ch: budget * share for ch, share in mix.items()}
    sufficient = all(spend.get(ch, 0) >= min_spend.get(ch, 0) for ch in min_spend)
    r_by = {ch: reach_curve(*reach_params[ch], spend[ch], cpm[ch]) for ch in spend}
    undup = unduplicated(r_by, overlap)
    resp = sum(roi_response(*roi_params[ch], spend[ch]) for ch in spend)
    roas = resp / max(budget, 1e-6)
    vol = 0.12
    band = (max(0, undup * (1 - vol)), min(1, undup * (1 + vol)))
    return {"eff_reach_3plus": undup, "roas": roas, "sufficient": sufficient, "ci": band}

# toy params for illustration
reach_params = {"CTV": (0.80, -0.0000045), "OLV": (0.85, -0.000006), "SocialVideo": (0.75, -0.0000075)}
roi_params   = {"CTV": (0.45, 0.00042),    "OLV": (0.50, 0.00050),  "SocialVideo": (0.55, 0.00055)}
cpm          = {"CTV": 28, "OLV": 14, "SocialVideo": 10}
min_spend    = {"CTV": 40000, "OLV": 25000, "SocialVideo": 15000}
overlap      = {"CTV": {"OLV": 0.32, "SocialVideo": 0.25},
                "OLV": {"CTV": 0.32, "SocialVideo": 0.28},
                "SocialVideo": {"CTV": 0.25, "OLV": 0.28}}

sample_eval = evaluate_mix(mix_candidates[0], 1_500_000, reach_params, roi_params, cpm, overlap, min_spend)

The output we show planners is a small set of mixes on the Pareto frontier, each with clearly spelled-out consequences: what we expect to gain, what we give up, and why the agent prefers one balance over another.
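
A toy continuation of the code above (not Abe’s production logic): evaluate every candidate mix from step 2, keep the ones that meet minimum spends, and drop any mix that another candidate beats on both effective reach and ROAS.

def pareto_frontier(mixes, budget, reach_params, roi_params, cpm, overlap, min_spend):
    evaluated = [(m, evaluate_mix(m, budget, reach_params, roi_params, cpm, overlap, min_spend))
                 for m in mixes]
    evaluated = [(m, e) for m, e in evaluated if e["sufficient"]]  # respect minimum viable spends
    frontier = []
    for m, e in evaluated:
        # A mix is dominated if some other candidate is at least as good on both
        # metrics and strictly better on one
        dominated = any(
            o["eff_reach_3plus"] >= e["eff_reach_3plus"] and o["roas"] >= e["roas"]
            and (o["eff_reach_3plus"] > e["eff_reach_3plus"] or o["roas"] > e["roas"])
            for _, o in evaluated
        )
        if not dominated:
            frontier.append((m, e))
    return frontier

frontier = pareto_frontier(mix_candidates, 1_500_000, reach_params, roi_params, cpm, overlap, min_spend)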



4) Check reality and adjust based on outcomes


Plans don’t end at approval; they start there. The same APIs that assemble the plan also place buys and pull live delivery. As data lands, we compare predicted vs. observed: did CPMs match estimates? Are we trading reach for frequency? Are ROI and KPIs tracking as expected? Are any sub-audiences underperforming?


When reality disagrees, the agent changes its mind. Curves are recalibrated, overlaps tuned, risks repriced; the live mix is nudged within constraints, and future plans inherit better priors. That act → measure → learn loop is the difference between a chatbot and a planner agent.


Tiny example - nudge plan from live telemetry

def adjust_from_observed(mix, forecast, observed):
    er_obs = observed["eff_reach_3plus"]
    lo, hi = forecast["ci"]
    new = mix.copy()

    # If observed reach falls below forecast band, rebalance toward higher-reach channels
    if er_obs < lo:
        sorted_ch = sorted(forecast["by_channel_reach"].items(), key=lambda x: x[1])
        low, high = sorted_ch[0][0], sorted_ch[-1][0]
        delta = min(0.05, new.get(low, 0))
        new[low] -= delta
        new[high] = new.get(high, 0) + delta

    # Top up channels that are under practical min spend (inventory/pace)
    for ch in observed.get("underfunded_channels", []):
        new[ch] = new.get(ch, 0) + 0.02

    # Renormalise
    s = sum(new.values()) or 1.0
    return {c: v/s for c, v in new.items()}
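
A hypothetical call, reusing sample_eval from step 3 as the forecast (the observed numbers are made up for illustration):

# Observed effective reach has fallen below the forecast band and one channel is under-delivering
observed = {"eff_reach_3plus": 0.41, "underfunded_channels": ["SocialVideo"]}
adjusted_mix = adjust_from_observed(mix_candidates[0], sample_eval, observed)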

In Abe, this wires into ad-server/DSP telemetry: pacing, spend, reach/frequency, ROAS/CPA, and attention proxies, plus risk monitors (CPM inflation, supply drops). The agent doesn’t overreact to noise; it reacts within confidence bands, with clear guardrails.



Why this works


Media planning is where art meets science: facts, models, and human-like reasoning coming together to produce outcomes. Machine learning excels at generating scenarios and predictions from vast data, but it struggles to apply the context of brand, objective, and constraints to decide what to do next. AI agents are good at reading context and preferences, but on their own they can’t reliably anticipate what will happen once a plan is executed.


Bring the two together and you get the best of both: ML supplies grounded scenarios; the AI agent applies context, simulates consequences, selects the next best action, then acts and monitors reality to course-correct in-flight.


Agents don’t need to be preprogrammed geniuses; they need the right data, diverse scenarios, and strong guardrails. Reasoning in AI isn’t about intuition, it’s about simulating with rigor and letting reality have the last word.



 
 
 
