---
title: Build a Slack deep research bot (in-depth)
description: A deep dive into building a durable multi-agent research pipeline triggered by Slack mentions, covering fire-and-forget webhooks, checkpointed steps, three-agent decomposition, and live debugging in Studio.
tags: ["Slack", "Agents", "Webhooks", "Long-running", "Observability"]
date: 2026-06-15
last_verified: 2026-06-15
audience: both
---

Slack gives you 3 seconds to respond to a webhook before it retries. A research pipeline (scoping subtopics, querying Wikipedia, synthesising a report) takes 30–90 seconds. Handling both in the same request handler guarantees timeouts and duplicate runs.

AGNT5 solves this with a **fire-and-forget** model. The webhook handler records the incoming event and replies `200 OK` in under a second. A durable workflow run starts in the background, executes each research stage as a checkpointed step, and calls `chat.postMessage` when the report is ready. If the worker is restarted mid-pipeline, the run replays from the last completed checkpoint with no repeated LLM calls and no lost work.

This cookbook walks through the complete implementation using the `python/slack_deep_research` template. You will understand every moving part: how the Slack envelope is parsed, why three agents produce better results than one monolithic prompt, what "durable" actually means when a worker crashes, and how to use Studio's Events view to debug the bot without adding any logging infrastructure.

<DocsImage
  src="/docs/integrations/slack-cookbook-2.png"
  alt="End-to-end flow diagram: user mentions @ResearchBot in Slack, Slack sends an app_mention webhook, AGNT5 records the event and returns 200 OK in under 1 second. The Deep Research Workflow starts slack_deep_research, parses the request, checks validity — invalid messages exit early, valid messages extract the topic and remove the @mention prefix. The Research Pipeline then runs ScopingAgent, ResearchAgent, and WritingAgent in sequence, finishing with chat.postMessage posting a threaded reply in Slack."
  caption="The fire-and-forget pattern: AGNT5 acknowledges the Slack webhook immediately, then runs the full research pipeline asynchronously and posts the report back as a threaded reply."
  width={1344}
  height={1316}
/>

---

## What you will learn

- How the **fire-and-forget webhook pattern** decouples Slack's 3-second timeout from a 30–90-second pipeline.
- How `ctx.step()` **checkpoints** each stage so a worker restart never re-runs completed work.
- Why a **three-agent pipeline** (scope → research → write) produces more reliable, testable results than a single monolithic prompt.
- How the **bot loop guard** prevents the bot from replying to its own posts and triggering infinite runs.
- How to use **Studio Events** to see every incoming Slack event, inspect its raw payload, and trace it through to the workflow run and individual step outputs.

## What you build

- A Slack App connected to AGNT5 via the Studio integration.
- Three specialised [agents](/docs/build/agents.md): `ScopingAgent`, `ResearchAgent`, `WritingAgent`.
- A durable [workflow](/docs/build/workflows.md) triggered by `slack.app_mention` and `slack.message` events that posts a threaded research report back to Slack.

---

## Core concepts

### Durable workflows and checkpointing

Every call to `ctx.step()` in AGNT5 is a checkpoint. Before executing the step function, AGNT5 writes an intent record to the durable journal. After the function returns, it writes the output alongside it. If the worker process is killed or restarted at any point, the workflow replays. Completed steps return their stored output immediately. Only the pending steps execute again.

For a research pipeline this matters significantly. Suppose the worker restarts after `conduct_research` completes but before `write_report` starts:

```
Without checkpointing:
  All three stages re-run → repeated Wikipedia requests + LLM calls → extra cost and latency

With checkpointing (what AGNT5 does):
  plan_research     → journal hit, return stored output instantly
  conduct_research  → journal hit, return stored output instantly
  write_report      → not in journal yet, execute normally
```

Design around step boundaries: put meaningful work inside `ctx.step()` calls, not between them.
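
The replay behaviour can be illustrated with a toy in-memory journal. This is a sketch for intuition only: `ToyJournal` and its `step` method are hypothetical names, not the AGNT5 API, and a real journal is persisted durably rather than held in a dict.

```python
import asyncio
from typing import Any, Awaitable, Callable


class ToyJournal:
    """Hypothetical stand-in for a durable journal (not the AGNT5 implementation)."""

    def __init__(self) -> None:
        self.outputs: dict[str, Any] = {}   # step name -> stored output
        self.executed: list[str] = []       # steps that actually ran in this process

    async def step(self, name: str, fn: Callable[[], Awaitable[Any]]) -> Any:
        if name in self.outputs:            # journal hit: return the stored output
            return self.outputs[name]
        result = await fn()                 # journal miss: execute the step
        self.outputs[name] = result         # checkpoint the output
        self.executed.append(name)
        return result


async def pipeline(journal: ToyJournal) -> str:
    async def plan() -> str:
        return "plan"

    async def research() -> str:
        return "findings"

    async def write() -> str:
        return "report"

    await journal.step("plan_research", plan)
    await journal.step("conduct_research", research)
    return await journal.step("write_report", write)


# Simulate a restart after conduct_research: two outputs are already stored.
journal = ToyJournal()
journal.outputs = {"plan_research": "plan", "conduct_research": "findings"}
report = asyncio.run(pipeline(journal))
```

After the simulated restart, `journal.executed` contains only `write_report`. Any work placed between `step` calls would run again on every replay, which is why meaningful work belongs inside a step.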

### Multi-agent decomposition

A single "research and summarise this topic for Slack" prompt has to plan, search, and write inside one context window. It cannot be evaluated in isolation, and when the output is poor it is hard to determine whether the planning, the research, or the writing was the weak link.

Breaking the task into three agents with distinct responsibilities solves all three problems:

| Agent | Responsibility | Why it is isolated |
|---|---|---|
| `ScopingAgent` | Plan 3–6 subtopics with a research strategy | Tuned and evaluated without touching research logic |
| `ResearchAgent` | Fetch Wikipedia and web pages for each subtopic | Has access to [tools](/docs/build/tools.md). The other agents do not. |
| `WritingAgent` | Synthesise findings into a Slack-safe report | Enforces format constraints independently of research quality |

Each agent also has its own context window. `ResearchAgent` receives the topic and the research plan and nothing else. It does not carry the writing instructions, which would waste tokens and add noise.

### The fire-and-forget webhook pattern

When Slack delivers an event, AGNT5 parses the incoming HTTP request, records the event to the journal, and returns `200 OK` before the workflow function executes a single line. Slack sees a fast response and stops retrying. The workflow runs asynchronously with no connection to the original HTTP request. When the pipeline finishes it posts back to Slack itself.

This means:
- **No Slack timeouts.** The 3-second acknowledgement window is irrelevant to the pipeline duration.
- **No polling or job queues.** The durable journal is the queue.
- **Retry-safe.** AGNT5 uses the event's unique ID to deduplicate retried Slack deliveries.
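
The shape of the pattern can be sketched in plain `asyncio`. This is an illustration under stated assumptions, not how AGNT5 implements it: `handle_webhook`, `run_pipeline`, and the in-memory `seen_event_ids` set are hypothetical, and an in-process task is not durable the way a journaled workflow run is. The `event_id` key follows the field Slack includes on each Events API delivery.

```python
import asyncio

seen_event_ids: set[str] = set()        # hypothetical dedupe store (in memory only)
background: set[asyncio.Task] = set()   # keep references so tasks are not garbage-collected
completed: list[str] = []


async def run_pipeline(event_id: str) -> None:
    await asyncio.sleep(0.05)           # stands in for the 30-90 second pipeline
    completed.append(event_id)


async def handle_webhook(event: dict) -> int:
    event_id = event["event_id"]
    if event_id not in seen_event_ids:  # a retried delivery is acknowledged but not re-run
        seen_event_ids.add(event_id)
        task = asyncio.create_task(run_pipeline(event_id))
        background.add(task)
        task.add_done_callback(background.discard)
    return 200                          # acknowledge before the pipeline finishes


async def main() -> list[int]:
    event = {"event_id": "Ev123", "type": "app_mention"}
    # The second call simulates Slack retrying the same delivery.
    statuses = [await handle_webhook(event), await handle_webhook(event)]
    await asyncio.gather(*background)   # wait only so the demo can exit cleanly
    return statuses


statuses = asyncio.run(main())
```

Both deliveries are acknowledged with `200`, but the pipeline runs once.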

---

## Prerequisites

- An AGNT5 account. Sign up at [app.agnt5.com](https://app.agnt5.com).
- Python 3.11+ and [uv](https://docs.astral.sh/uv/getting-started/installation/).
- An [OpenAI API key](https://platform.openai.com/api-keys): the template uses `gpt-4o-mini` for all three agents.
- A Slack workspace where you can create and install apps.

---

## Step 1: Install the CLI

```bash
agnt5 auth login
agnt5 auth status   # confirms: Signed in as <your-email>
```

If `agnt5` is not installed yet, follow the [CLI install guide](/docs/install-cli.md) first, then run the commands above.

**Why this step matters:** The CLI authenticates your local machine and gives `agnt5 deploy` the credentials it needs to push your container image to the AGNT5 registry and register the deployment.

---

## Step 2: Create the project

```bash
agnt5 create --template python/slack_deep_research my-research-bot
cd my-research-bot
uv sync
```

**Why this step matters:** The template scaffolds the complete project: `app.py` (worker entry point), `src/slack_deep_research/` ([agents](/docs/build/agents.md), [functions](/docs/build/functions.md), [tools](/docs/build/tools.md), [workflows](/docs/build/workflows.md)). You can deploy immediately without writing boilerplate. The `agnt5.yaml` at the root defines the entry command and language version for reproducible builds.

The project structure you get:

```
my-research-bot/
├── app.py                          # Worker entry point — registers all components
├── agnt5.yaml                      # Deployment config (entry command, language version)
├── pyproject.toml
├── .env.example                    # Environment variable template — copy to .env
└── src/slack_deep_research/
    ├── agents.py                   # ScopingAgent, ResearchAgent, WritingAgent
    ├── functions.py                # plan_research, conduct_research, write_report
    ├── tools.py                    # wikipedia_search_tool, fetch_webpage_tool
    └── workflows.py                # slack_deep_research workflow + envelope helpers
```

---

## Step 3: Configure environment variables

```bash
cp .env.example .env
```

Open `.env` and add your OpenAI key:

```bash
OPENAI_API_KEY="sk-..."
```

**Why this step matters:** All three agents call `openai/gpt-4o-mini`. One key covers the entire pipeline. `SLACK_BOT_TOKEN` is injected at runtime by the AGNT5 Slack integration. You do not add it here.

---

## Step 4: Set up the Slack App and Studio integration

Follow the [Slack integration guide](/docs/integrations/event-sources/slack.md) to create your Slack app and connect it to AGNT5 in Studio. When configuring the app:

- **Bot token scopes:** `app_mentions:read`, `channels:history`, `chat:write`
- **Event subscriptions:** `app_mention` and `message.channels`

When adding the integration in Studio, you will be prompted to choose three things: the **deployment** that receives the events, the **event type** (app mention or message), and the **target** workflow to trigger. Select the deployment you created, the event types your bot should respond to, and `slack_deep_research` as the target workflow.

<DocsImage
  src="/docs/integrations/slack-integration-target-light.png"
  darkSrc="/docs/integrations/slack-integration-target-dark.png"
  alt="Studio Slack integration configuration panel showing deployment selector, event type selector, and target workflow selector set to slack_deep_research."
  caption="Select the deployment, event type, and target workflow in the Studio integration panel. AGNT5 routes every matching Slack event to the chosen workflow automatically."
  width={1600}
  height={900}
/>

**Why this step matters:** The `app_mention` and `message.channels` event subscriptions instruct Slack to deliver events to AGNT5, and the `app_mentions:read` and `channels:history` scopes authorise the app to receive them. `chat:write` is the scope `chat.postMessage` uses to post the finished report. The Studio integration handles Slack signature verification automatically. You do not implement that yourself.

Come back here once Studio shows the integration as **Active**.

---

## Step 5: Deploy

```bash
agnt5 deploy
```

The CLI builds a container image from your project directory, pushes it to the AGNT5 registry, and creates a live deployment. Your workflow is now listening for Slack events.

**Why this step matters:** AGNT5 runs your worker as a managed container it can restart, scale, and recover automatically. Deploying is what connects your local code to the Studio integration that receives Slack events.

---

## Step 6: Test the bot

Invite the bot to a channel, then mention it with a research topic:

```
/invite @ResearchBot

@ResearchBot evolution of AI
```

The pipeline starts immediately. A threaded reply arrives in 30–90 seconds depending on how many subtopics the scoping agent generates and how quickly Wikipedia responds.

<DocsImage
  src="/docs/integrations/slack-research.png"
  alt="Slack channel showing the deep-research-bot replying in a thread to 'evolution of AI' with an Executive Summary, Key Findings, and Sources sections."
  caption="The bot posts a structured research report as a threaded reply. The report includes an Executive Summary, Key Findings, and cited Wikipedia sources."
  width={1400}
  height={857}
/>

Open **Studio → Events** to see the incoming event record, and **Studio → Runs** to watch each `ctx.step()` execute in real time.

---

## How it works

### Parsing the Slack envelope

Every Slack event arrives as a nested JSON envelope. The workflow receives it as `**envelope` and passes it through a chain of helpers to extract the fields the pipeline needs:

```python
# workflows.py
def _extract_slack_message(envelope: dict[str, Any]) -> dict[str, str]:
    body = _parse_webhook_body(envelope)
    slack_event = _nested_dict(body, "event")

    # Text lives in different locations for app_mention vs message events
    text = (
        _first_string(slack_event, "text")
        or _first_string(body, "text")
        or _first_string(envelope, "text")
    )

    return {
        "event_type": ...,
        "message":   text,
        "channel":   _first_string(slack_event, "channel") or _first_string(body, "channel"),
        "user":      _first_string(slack_event, "user", "username", "bot_id") or ...,
        "subtype":   _first_string(slack_event, "subtype"),
        "bot_id":    _first_string(slack_event, "bot_id"),
        "thread_ts": _first_string(slack_event, "thread_ts", "ts"),
    }
```

The fallback chain (`slack_event → body → envelope`) exists because the envelope shape differs slightly between `app_mention` and `message` event types. `thread_ts`, the timestamp of the original message, is what makes `chat.postMessage` attach the reply as a thread rather than a new top-level message in the channel.
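
To make the lookup order concrete, here is a minimal sketch run against a trimmed `app_mention` body. The template's own helpers are not shown in this cookbook, so `first_string` and `nested_dict` below are assumed stand-ins for `_first_string` and `_nested_dict`, and the IDs in the sample payload are invented.

```python
from typing import Any


def first_string(source: dict[str, Any], *keys: str) -> str:
    """Return the first non-empty string found under any of the given keys."""
    for key in keys:
        value = source.get(key)
        if isinstance(value, str) and value.strip():
            return value
    return ""


def nested_dict(source: dict[str, Any], key: str) -> dict[str, Any]:
    """Return source[key] if it is a dict, otherwise an empty dict."""
    value = source.get(key)
    return value if isinstance(value, dict) else {}


# Trimmed app_mention body following the Slack Events API shape (sample values).
body = {
    "type": "event_callback",
    "event_id": "Ev0001",
    "event": {
        "type": "app_mention",
        "user": "U111",
        "text": "<@U999> evolution of AI",
        "ts": "1718450000.000100",
        "channel": "C222",
    },
}

slack_event = nested_dict(body, "event")
extracted = {
    "message": first_string(slack_event, "text") or first_string(body, "text"),
    "channel": first_string(slack_event, "channel") or first_string(body, "channel"),
    # A top-level mention has no thread_ts, so the lookup falls back to ts.
    "thread_ts": first_string(slack_event, "thread_ts", "ts"),
    "bot_id": first_string(slack_event, "bot_id"),
}
```

Returning an empty string for missing fields keeps the result a plain `dict[str, str]`, which is what the guard in the next section relies on when it tests fields for truthiness.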

### The bot loop guard

Before starting the pipeline the workflow checks whether the event should be processed:

```python
# workflows.py
def _should_process(message: dict[str, str]) -> tuple[bool, str]:
    if message.get("bot_id"):
        return False, "bot message"
    if message.get("subtype"):
        return False, f"unsupported subtype: {message['subtype']}"
    if not (message.get("message") or "").strip():
        return False, "empty message"
    return True, ""
```

Without this guard the bot would receive its own `chat.postMessage` delivery as an event, start a new research run on the report text, post another reply, and loop indefinitely. The `bot_id` field is set by Slack on every message sent via a bot token. Checking it is the reliable way to break the cycle.

Events that fail this check return `{"status": "skipped", "reason": "..."}` and appear in Studio → Events with a **Skipped** badge.
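
A quick way to see the guard's behaviour is to run it against a few representative messages. The function body is repeated from the snippet above so the example runs on its own. The sample messages are invented for illustration.

```python
def should_process(message: dict[str, str]) -> tuple[bool, str]:
    # Same checks as _should_process, in the same order.
    if message.get("bot_id"):
        return False, "bot message"
    if message.get("subtype"):
        return False, f"unsupported subtype: {message['subtype']}"
    if not (message.get("message") or "").strip():
        return False, "empty message"
    return True, ""


cases = {
    "own_report": {"bot_id": "B123", "message": "*Evolution of AI* ..."},
    "edited": {"subtype": "message_changed", "message": "edited text"},
    "blank": {"message": "   "},
    "mention": {"message": "<@U999> evolution of AI"},
}
results = {name: should_process(msg) for name, msg in cases.items()}
```

Only the `mention` case passes. The order matters for diagnostics: a bot message that also carries a subtype is reported as `bot message`, because that check runs first.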

### Extracting the research topic

`app_mention` events include the `<@UXXXXXXX>` mention token prepended to the message text. The `_extract_topic` function strips it before passing the topic to the pipeline so agents see clean input:

```python
# workflows.py
def _extract_topic(message: dict[str, str]) -> str:
    text = (message.get("message") or "").strip()
    # Strip <@U12345> prefix that Slack adds to app_mention events
    text = re.sub(r"^<@[A-Z0-9]+>\s*", "", text).strip()
    # Remove an optional leading "research" keyword
    topic = re.sub(r"^research\s+", "", text, flags=re.IGNORECASE).strip()
    return topic or text
```

For example, `"<@U12345> evolution of AI"` becomes `"evolution of AI"` before it reaches `ScopingAgent`.
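
The following runnable check exercises the same two regular expressions on a few inputs. The function is copied from the snippet above. The sample inputs are illustrative.

```python
import re


def extract_topic(message: dict[str, str]) -> str:
    text = (message.get("message") or "").strip()
    # Strip a leading <@U12345> mention token
    text = re.sub(r"^<@[A-Z0-9]+>\s*", "", text).strip()
    # Remove an optional leading "research" keyword
    topic = re.sub(r"^research\s+", "", text, flags=re.IGNORECASE).strip()
    return topic or text


topics = [
    extract_topic({"message": "<@U12345> evolution of AI"}),
    extract_topic({"message": "<@U12345> Research quantum computing"}),
    extract_topic({"message": "evolution of AI"}),
    extract_topic({"message": "hey <@U12345> evolution of AI"}),
]
```

The last input is returned unchanged because the pattern is anchored to the start of the text. A mention that appears mid-sentence would need a separate rule if you want it removed.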

### The three-stage pipeline

The workflow runs three `ctx.step()` calls in sequence. Each call is a durable checkpoint. Its output is stored in the journal before the next step begins:

```python
# workflows.py
research_plan     = await ctx.step(_plan_research,     topic)
research_findings = await ctx.step(_conduct_research,  topic, research_plan)
report            = await ctx.step(_write_report,      topic, research_plan, research_findings)
```

**Stage 1: ScopingAgent (`plan_research`)**

`ScopingAgent` receives only the raw topic and produces a structured plan with 3–6 subtopics. Its system prompt requires every response to start with `PLAN:`, a structured output contract the function strips before passing downstream:

```python
# functions.py
@function(name="plan_research")
async def _plan_research(ctx: FunctionContext, topic: str) -> str:
    current_date = datetime.now(timezone.utc).strftime("%Y-%m-%d")
    prompt = f"""Today's date is {current_date}.

Research topic: {topic}

Create a structured research plan for this topic:
1. Break it into 3-6 manageable subtopics
2. Define the research strategy for each subtopic
3. Make reasonable assumptions if the topic is vague or ambiguous
4. Start your response with "PLAN:" followed by the structured plan"""

    result = await scoping_agent.run(prompt, context=ctx)
    plan = result.output
    if plan.strip().startswith("PLAN:"):
        plan = plan.replace("PLAN:", "", 1).strip()
    return plan
```

Why plan first? Without an explicit plan, `ResearchAgent` decides what to search for at generation time while simultaneously running tool calls. This leads to uneven subtopic coverage. An upfront plan commits a complete research strategy before any searching begins, and it is stored as a checkpoint so you can inspect it in Studio even if later stages fail.

The agent definition enforces the output format via its system prompt:

```python
# agents.py
scoping_agent = Agent(
    name="ScopingAgent",
    model="openai/gpt-4o-mini",
    instructions="""You are a research scoping specialist who structures research requests into actionable plans.

Your responsibilities:
1. Analyze the research topic and make reasonable assumptions if anything is ambiguous
2. Create a structured research plan with 3-6 subtopics that comprehensively covers the topic
3. If the topic mentions acronyms or specialized terms, include them as subtopics to research

Guidelines for research planning:
- Break complex topics into 3-6 manageable subtopics
- For vague topics, interpret them broadly and cover the most relevant aspects
- Define a clear research strategy for each subtopic
- Prioritize reliable sources (Wikipedia, academic content, official documentation)
- Make reasonable assumptions rather than asking for clarification

Output format:
PLAN:
[Structured research plan with subtopics and strategy]

Always start your response with "PLAN:" followed by the research plan.""",
    max_tokens=8192,
)
```

**Stage 2: ResearchAgent (`conduct_research`)**

`ResearchAgent` receives the topic and the research plan. It has two tools, `wikipedia_search_tool` and `fetch_webpage_tool`, and uses them to gather evidence for each planned subtopic:

```python
# agents.py
research_agent = Agent(
    name="ResearchAgent",
    model="openai/gpt-4o-mini",
    instructions="""You are a systematic research specialist who gathers comprehensive information.

Your responsibilities:
1. Execute research according to the provided research plan
2. Use Wikipedia as the primary source for reliable information
3. Supplement with web searches for additional context when needed
4. Organize findings in a clear, structured format

Research guidelines:
- Start with Wikipedia searches for each subtopic
- Verify information across multiple sources when possible
- Focus on factual information, data, and verifiable details
- Cite sources clearly (include URLs and titles)
- Organize findings by subtopic for easy reference

Output format:
For each subtopic, provide:
## [Subtopic Name]

**Sources:**
- [Source 1: Title and URL]

**Key Findings:**
- [Finding 1]
- [Finding 2]

---
Continue this format for all subtopics.""",
    tools=[wikipedia_search_tool, fetch_webpage_tool],
    max_tokens=8192,
)
```

`wikipedia_search_tool` queries the Wikipedia Search API and handles rate limits with exponential back-off:

```python
# tools.py
@tool(auto_schema=True)
async def wikipedia_search_tool(ctx: Context, query: str, max_results: int = 3) -> str:
    """Search Wikipedia for articles related to the research query.

    Queries the Wikipedia Search API and returns formatted results with
    titles, article URLs, and snippet previews.
    """
    max_retries = 3
    for attempt in range(max_retries):
        response = requests.get(base_url, params=params, headers=headers, timeout=30)

        if response.status_code == 429:
            wait = 2 ** attempt * 3   # 3s → 6s → 12s
            ctx.logger.warning(
                f"Wikipedia rate-limited (429), retrying in {wait}s (attempt {attempt + 1}/{max_retries})"
            )
            await asyncio.sleep(wait)
            continue

        response.raise_for_status()
        # parse and return formatted results ...
```
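
The retry schedule can be isolated into a small helper, which also makes it testable without real network calls. This is a sketch, not the template's code: `retry_with_backoff` and `RateLimited` are hypothetical names, and the `sleep` parameter exists only so a test can record waits instead of actually sleeping. Unlike the excerpt above, this version re-raises once attempts are exhausted rather than falling out of the loop.

```python
import asyncio
from typing import Awaitable, Callable, TypeVar

T = TypeVar("T")


class RateLimited(Exception):
    """Hypothetical signal that the upstream API answered 429."""


async def retry_with_backoff(
    call: Callable[[], Awaitable[T]],
    max_retries: int = 3,
    base_delay: float = 3.0,
    sleep: Callable[[float], Awaitable[None]] = asyncio.sleep,
) -> T:
    for attempt in range(max_retries):
        try:
            return await call()
        except RateLimited:
            if attempt == max_retries - 1:
                raise                               # out of attempts: surface the error
            await sleep(base_delay * 2 ** attempt)  # 3s, then 6s, ...
    raise RuntimeError("max_retries must be at least 1")


# Demo: a call that is rate-limited twice, then succeeds.
waits: list[float] = []
calls = {"count": 0}


async def fake_sleep(seconds: float) -> None:
    waits.append(seconds)                           # record instead of sleeping


async def flaky_call() -> str:
    calls["count"] += 1
    if calls["count"] < 3:
        raise RateLimited()
    return "ok"


result = asyncio.run(retry_with_backoff(flaky_call, sleep=fake_sleep))
```

Raising on the final failure gives the calling agent an explicit error to report, instead of an empty tool result that it may interpret as "no articles found".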

`fetch_webpage_tool` strips scripts, styles, navigation, and other non-content elements (headers, footers, asides, iframes) from the HTML using BeautifulSoup, then returns up to 8 000 characters of main content:

```python
# tools.py
@tool(auto_schema=True)
async def fetch_webpage_tool(ctx: Context, url: str) -> str:
    """Fetch and extract text content from a webpage for research purposes.

    Retrieves the HTML page at the given URL, strips non-content elements,
    and returns clean text truncated to 8 000 chars.
    """
    for element in soup(["script", "style", "nav", "footer", "header", "aside", "noscript", "iframe"]):
        element.decompose()

    # Prefer <main> or <article> over the full body
    main_content = (
        soup.find("main")
        or soup.find("article")
        or soup.find(attrs={"class": ["content", "main-content", "article-content"]})
        or soup.find("div", attrs={"id": ["content", "main", "article"]})
        or soup.find("body")
    )

    max_chars = 8000
    if len(text) > max_chars:
        truncated = text[:max_chars]
        last_sentence = truncated.rfind(".")
        if last_sentence > max_chars * 0.8:
            text = truncated[:last_sentence + 1] + " ... [truncated]"
        else:
            text = truncated + " ... [truncated]"
```

8 000 characters is enough for an agent to extract key facts per subtopic without overflowing the context window when multiple pages are fetched across 3–6 subtopics.
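
The truncation rule is easier to reason about as a standalone function. The sketch below lifts the same logic out of the tool. `truncate_at_sentence` is a name chosen for this example, and the small `max_chars` values are only there to keep the demonstration readable.

```python
def truncate_at_sentence(text: str, max_chars: int = 8000) -> str:
    """Cut text to max_chars, preferring to end on a sentence boundary."""
    if len(text) <= max_chars:
        return text
    truncated = text[:max_chars]
    last_sentence = truncated.rfind(".")
    # Only back up to the last full stop if that keeps at least 80% of the budget.
    if last_sentence > max_chars * 0.8:
        return truncated[: last_sentence + 1] + " ... [truncated]"
    return truncated + " ... [truncated]"


clean_cut = truncate_at_sentence("First sentence. Second sentence is long.", max_chars=17)
hard_cut = truncate_at_sentence("a" * 50, max_chars=10)
untouched = truncate_at_sentence("Short text.", max_chars=100)
```

The 80% threshold is the trade-off: ending on a full stop is preferred, but not if it would discard more than a fifth of the allowed content.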

**Stage 3: WritingAgent (`write_report`)**

`WritingAgent` receives the topic, plan, and full research findings. Its system prompt enforces Slack's `mrkdwn` formatting dialect because Slack does not render standard CommonMark:

```python
# agents.py
writing_agent = Agent(
    name="WritingAgent",
    model="openai/gpt-4o-mini",
    instructions="""You are a writing specialist who synthesizes research into clear, concise summaries.

Your responsibilities:
1. Transform research findings into a well-structured summary suitable for Slack
2. Keep the report concise — the executive summary and key takeaways are most important
3. Use plain text formatting that renders well in Slack (avoid HTML, use * for bold, use - for bullets)

Report structure:
*[Research Topic]*

*Executive Summary*
[2-3 sentences overview]

*Key Findings*
- [Finding 1]
- [Finding 2]
(up to 5 most important findings)

*Sources*
- [Source Name]: [URL]
(top 3-5 sources)

Keep the total response under 2000 characters so it fits cleanly in a Slack message.
Do NOT use markdown headers (##), HTML, or tables — plain text only.""",
    max_tokens=4096,
)
```

The 2 000-character limit is deliberate: Slack recommends keeping a message's `text` under 4 000 characters (far longer messages are truncated), and concise replies read better in a thread. The agent is instructed to use `*bold*` (Slack `mrkdwn`) rather than `**bold**` (CommonMark), and `-` for bullets rather than `*` to avoid ambiguity with bold markers.
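
Prompt instructions are not a guarantee, so you may want a small post-processing pass as a safety net. The helper below is a hypothetical addition, not part of the template. It covers only three common slips (CommonMark bold, `#` headers, and inline links) and does not attempt nested or combined formatting.

```python
import re


def to_slack_mrkdwn(text: str) -> str:
    """Best-effort conversion of common CommonMark constructs to Slack mrkdwn."""
    # **bold** -> *bold*
    text = re.sub(r"\*\*(.+?)\*\*", r"*\1*", text)
    # "## Header" -> "*Header*" (mrkdwn has no header syntax)
    text = re.sub(r"^#{1,6}\s+(.+)$", r"*\1*", text, flags=re.MULTILINE)
    # [label](https://url) -> <https://url|label>
    text = re.sub(r"\[([^\]]+)\]\((https?://[^)\s]+)\)", r"<\2|\1>", text)
    return text


raw = "## Key Findings\n- **Fast** growth, see [Wikipedia](https://en.wikipedia.org/wiki/AI)"
converted = to_slack_mrkdwn(raw)
```

If you adopt something like this, apply it to the report text just before the call to `_post_to_slack`, so the stored `write_report` output still shows what the agent actually produced.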

### Posting back to Slack

The report is posted with `chat.postMessage`. Passing `thread_ts` threads the reply under the original mention:

```python
# workflows.py
async def _post_to_slack(
    ctx: WorkflowContext,
    channel: str,
    text: str,
    thread_ts: str | None,
    bot_token: str,
) -> dict[str, Any]:
    payload: dict[str, Any] = {"channel": channel, "text": text}
    if thread_ts:
        payload["thread_ts"] = thread_ts   # attaches as a thread reply

    try:
        async with httpx.AsyncClient(timeout=15) as client:
            response = await client.post(
                "https://slack.com/api/chat.postMessage",
                headers={
                    "Authorization": f"Bearer {bot_token}",
                    "Content-Type": "application/json; charset=utf-8",
                },
                json=payload,
            )
    except httpx.HTTPError as exc:
        ctx.logger.warning(f"Slack post failed: {exc}")
        return {"posted": False, "reason": exc.__class__.__name__}

    try:
        data = response.json()
    except ValueError:
        data = {"ok": False, "error": response.text[:200]}

    if response.status_code >= 400 or not data.get("ok"):
        reason = str(data.get("error") or response.status_code)
        ctx.logger.warning(f"Slack post failed: {reason}")
        return {"posted": False, "reason": reason}

    return {"posted": True, "ts": data.get("ts", "")}
```

`thread_ts` is the parent thread's `thread_ts` when the mention is itself inside a thread, and otherwise the `ts` of the mentioning message. If neither value is present in the event, the reply is posted as a new top-level message. The step returns `{"posted": true, "ts": "..."}` on success, visible in the Studio run output so you can confirm delivery without opening Slack.

### Checkpointing in practice

What happens if the worker pod is restarted after `conduct_research` completes but before `write_report` starts?

```
Journal state at restart:
  plan_research     ✓  output stored
  conduct_research  ✓  output stored
  write_report      ✗  not yet recorded

Workflow replay:
  plan_research     → journal hit → return stored plan   (< 1ms, no LLM call)
  conduct_research  → journal hit → return stored findings (< 1ms, no Wikipedia requests)
  write_report      → not in journal → execute normally
```

- No Wikipedia requests are retried.
- No scoping or research LLM calls are repeated.
- The run resumes exactly where it left off.
- You can verify this in Studio → Runs: replayed steps show near-zero duration, while the re-executed step shows real latency.

---

## Debug in Studio

### Viewing incoming events

Open **Studio → Events**. Every Slack webhook delivery appears as a row with a timestamp, event name (`slack.app_mention`, `slack.message`), and a status badge.

Two badge states to know:

- **Processing:** `_should_process` returned `True`. A workflow run was started.
- **Skipped:** `_should_process` returned `False` (bot message, empty text, or unsupported subtype). The workflow exited without running the pipeline.

<DocsImage
  src="/docs/integrations/slack-events-light.png"
  darkSrc="/docs/integrations/slack-events-dark.png"
  alt="Studio Events view showing a list of incoming Slack webhook deliveries with timestamps, event names, and Processing or Skipped status badges."
  caption="Studio → Events lists every Slack webhook delivery. Click any row to inspect the raw payload and follow the link to the triggered workflow run."
  width={1600}
  height={900}
/>

If you mention the bot and no Slack reply arrives, start here. A **Skipped** badge means the event was received but filtered before the pipeline could start. Open the payload pane to see which condition was triggered.

### Inspecting the event payload

Click any event row to open the detail pane. The fields map directly to what your workflow code receives as `**envelope`:

| Pane field | Where it comes from | What your code does with it |
|---|---|---|
| `source` | Slack integration config | Identifies the event origin as Slack vs HTTP or another source |
| `event` | `body.event.type` | Determines which trigger fired (`slack.app_mention` or `slack.message`) |
| `environment_ref` | Deployment environment | Confirms which deployment handled this event |
| `workflow` | Matched trigger rule | The workflow that started (or was skipped) for this event |
| `channel` | `body.event.channel` | Passed to `chat.postMessage` as the reply destination |
| `thread_ts` | `body.event.thread_ts`, falling back to `body.event.ts` | Passed to `chat.postMessage` to attach the reply as a thread |
| `bot_id` | `body.event.bot_id` | If non-empty, `_should_process` returns `False` (bot loop guard) |
| `text` | `body.event.text` | The raw message text including the `<@UXXXXXXX>` mention prefix |

The raw payload is exactly what `_extract_slack_message(envelope)` receives. If the extracted `channel` or `thread_ts` looks wrong in a run's output, compare against the payload pane. That is the authoritative source.

### Tracing event → run → steps

<DocsImage
  src="/docs/integrations/slack-runs-light.png"
  darkSrc="/docs/integrations/slack-runs-dark.png"
  alt="Studio Runs view showing the slack_deep_research workflow run with plan_research, conduct_research, and write_report steps, each displaying input, output, and duration."
  caption="Studio → Runs shows each ctx.step() checkpoint with its input, stored output, and execution duration — no extra logging needed."
  width={1600}
  height={900}
/>

Click a **Processing** event and follow the link to the Run. In the run view you see each `ctx.step()` with:

- **Input:** the arguments passed to the step function (topic string, plan text, research findings)
- **Output:** the return value stored in the journal (plan, findings, formatted report)
- **Duration:** how long each step took, including tool calls inside agent runs

This is the full observability loop: Slack message → event (Events view) → run (Runs view) → step-by-step trace (step outputs). No logging infrastructure required. AGNT5 captures all of it automatically from `ctx.step()` calls.

### Common debugging scenarios

| What you see in Studio | What it means | Where to look next |
|---|---|---|
| Event badge: **Skipped** | `_should_process` returned `False` | Payload pane: check `bot_id`, `subtype`, or empty `text` field |
| Event badge: **Processing**, no Slack reply | Workflow ran but `chat.postMessage` failed | Run view, final step output: check `slack_post.posted` and `slack_post.reason` |
| No event appears at all | Slack is not delivering to AGNT5 | Studio Integrations: verify the event subscription URL and confirm the app is installed in the target channel |
| `conduct_research` step shows sparse findings | Wikipedia returned few results for a subtopic | Check `plan_research` output. Subtopics may be too narrow or phrased as questions rather than search terms |
| `write_report` output is cut short | `WritingAgent` hit the 2 000-character limit | Expected. See "Extend it" below for splitting the report across multiple messages |
| Replayed step shows near-zero duration | That step was already checkpointed before restart | Normal. This confirms checkpointing is working. The result came from the journal |

---

## Running environments

The bot works across three environments: local, staging, and production. Each is covered in the docs:

- [Local development](/docs/build/local-development.md): use `agnt5 dev` to run the worker locally with file-watch reload. Trigger the workflow from Studio or the CLI without a real Slack event.
- [Environments](/docs/run/environments.md): deploy to preview, staging, or production with `agnt5 deploy --env <name>`. Promote a verified deployment forward rather than rebuilding, and scope `SLACK_BOT_TOKEN` per environment so staging posts to a test channel and production posts to the real one.
- [Deploying](/docs/run/deploying.md): covers the full deploy-verify-observe loop, including deployment logs, run listing, trace inspection, and promoting to production from Studio.

---

## Extend it

### Add web search alongside Wikipedia

`ResearchAgent` can fetch any page with `fetch_webpage_tool`, but Wikipedia is its only search source. Add a web search tool for broader coverage:

```python
# tools.py
import os

import httpx
from agnt5 import Context, tool

@tool(auto_schema=True)
async def web_search_tool(ctx: Context, query: str) -> str:
    """Search the web using a search API for broader coverage beyond Wikipedia."""
    async with httpx.AsyncClient(timeout=15) as client:
        response = await client.get(
            "https://api.yoursearchprovider.com/search",
            params={"q": query, "key": os.getenv("SEARCH_API_KEY")},
        )
    results = response.json().get("items", [])
    return "\n---\n".join(
        f"Title: {r['title']}\nURL: {r['link']}\nSnippet: {r['snippet']}"
        for r in results[:5]
    )
```

Register it on `ResearchAgent` and the `Worker`:

```python
# agents.py
research_agent = Agent(
    ...
    tools=[wikipedia_search_tool, fetch_webpage_tool, web_search_tool],
)

# app.py
worker = Worker(
    ...
    tools=[fetch_webpage_tool, wikipedia_search_tool, web_search_tool],
)
```

### Multi-turn conversation across thread replies

By default each mention starts a fresh pipeline with no memory of prior exchanges. To carry context across follow-up messages in the same Slack thread, store the report keyed by `thread_ts` in workflow state:

```python
# workflows.py — inside slack_deep_research
thread_ts = message.get("thread_ts")

# Load any prior research context for this thread
prior_context = await ctx.state.get(f"thread:{thread_ts}") or ""

research_plan = await ctx.step(_plan_research, topic, prior_context)
# ... rest of pipeline ...

# Persist the new report for the next message in this thread
await ctx.state.set(f"thread:{thread_ts}", report)
```

Update `_plan_research` to accept and include the prior context:

```python
# functions.py
@function(name="plan_research")
async def _plan_research(ctx: FunctionContext, topic: str, prior_context: str = "") -> str:
    current_date = datetime.now(timezone.utc).strftime("%Y-%m-%d")
    context_block = (
        f"\nPrior research from this Slack thread:\n{prior_context}\n"
        if prior_context else ""
    )
    prompt = f"""Today's date is {current_date}.{context_block}

Research topic: {topic}
..."""
```

### Post longer reports as multiple messages

If the synthesised report is longer than the roughly 4 000 characters Slack recommends for a single message, split it across messages in the same thread:

```python
# workflows.py
MAX_CHARS = 3800   # leave headroom for Slack's internal overhead

async def _post_report_in_chunks(ctx, channel, report, thread_ts, bot_token):
    chunks = [report[i : i + MAX_CHARS] for i in range(0, len(report), MAX_CHARS)]
    for i, chunk in enumerate(chunks):
        label = f"*(Part {i + 1}/{len(chunks)})*\n" if len(chunks) > 1 else ""
        await _post_to_slack(ctx, channel, label + chunk, thread_ts, bot_token)
```
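
Fixed-width slicing can cut a word, a URL, or a `*bold*` marker in half. One alternative is to split on line boundaries, as in the sketch below. `split_report` is a hypothetical helper, not part of the template, and it falls back to a hard cut only for a single line that exceeds the limit on its own.

```python
def split_report(report: str, max_chars: int = 3800) -> list[str]:
    """Split on line boundaries so that no chunk exceeds max_chars."""
    chunks: list[str] = []
    current = ""
    for line in report.splitlines(keepends=True):
        # Hard-split any single line that is longer than the limit.
        while len(line) > max_chars:
            if current:
                chunks.append(current)
                current = ""
            chunks.append(line[:max_chars])
            line = line[max_chars:]
        # Start a new chunk if this line would overflow the current one.
        if len(current) + len(line) > max_chars:
            chunks.append(current)
            current = ""
        current += line
    if current:
        chunks.append(current)
    return chunks


report = "A" * 10 + "\n" + "B" * 10 + "\n" + "C" * 10
by_line = split_report(report, max_chars=25)
long_line = split_report("x" * 60, max_chars=25)
```

The chunks concatenate back to the original report, so nothing is lost. To use it, replace the list comprehension in `_post_report_in_chunks` with a call to this helper.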

---

## What you learned

- **Fire-and-forget webhooks:** acknowledging the Slack event immediately and running the pipeline asynchronously is what makes slow AI tasks viable in Slack bots. This pattern applies any time you have a fast inbound trigger and a slow outbound result.
- **`ctx.step()` as a reliability primitive:** every step is a checkpoint. Worker restarts, container evictions, and redeployments do not lose in-progress work. Think of step boundaries as your pipeline's save points.
- **Multi-agent decomposition:** splitting scope / research / write into three agents gives you separate context windows, independent testability, and clean tool isolation. The same pattern applies to any pipeline that requires distinct reasoning modes in sequence.
- **Studio Events as a first-class debugging tool:** incoming event payloads, processing status, and step-level traces are all visible in Studio with no extra instrumentation. Most bot issues (wrong events, skipped messages, failed posts) can be diagnosed there before touching a log file.

---

## Next steps

- [Slack integration reference](/docs/integrations/event-sources/slack.md): full envelope structure, all event types, signature verification, and delivery semantics.
- [Build a deep research agent](/cookbooks/deep-research-agent.md): the same three-agent pipeline without the Slack trigger, useful as a standalone reference.
- [Build a durable research agent with approval and recovery](/cookbooks/durable-research-agent-approval-recovery.md): add a human approval gate before the report is posted.
- [Debug AI workflows with traces, not scattered logs](/cookbooks/workflow-native-observability.md): a deeper look at step-level inputs, outputs, durations, and model calls in the Studio trace view.
