Getting started with MCP servers

02 Sep 2025 - tsp
Last update 02 Sep 2025
Reading time 23 mins

In the world of large language models (LLMs) and AI assistants, the Model Context Protocol (MCP) has emerged as a modular, standardized method for exposing tools, structured prompts, and resources to LLMs. Over the past months, it has become the de facto standard for extending capabilities across nearly all major LLM agent frameworks and frontends, including OpenAI’s ChatGPT, Anthropic’s Claude, and integrations like Kilo Code in Cursor and VSCode.

This article introduces the MCP protocol, explains how it differs from traditional function calling, and walks through its architecture, lifecycle, and transport mechanisms. Later, we will implement a simple MCP server using Python and the FastMCP SDK to expose an MQTT interface to LLM agents.

It is important to emphasize that MCP does not invent anything new. All tasks that MCP can solve can already be solved with function calling, simple JSON documents and file or HTTP resources. But it provides a standardized and modular way to establish loose dynamic coupling and resource injection into orchestrator frameworks.

From Function Calling to Model Context Protocol

Originally, LLMs could interact with external systems using function calling: the orchestrator framework would inspect the current user intent or agent goal, collect the currently relevant tools from its internal registry (or a custom dynamic mechanism), serialize their definitions into OpenAI-compatible JSON, and pass them to the LLM along with the full conversation history in a stateless API call. When the model responds with a tool_call, it has processed the input and is requesting a specific method execution. The orchestrator then executes the tool, adds the result to the message stream, and initiates a new inference call. This loop enables dynamic and modular behavior, but the registry and discovery of tools remains framework-specific and not standardized across different agents or environments. This approach, while flexible within a custom orchestrator, still comes with practical limitations in typical usage: every framework defines its own registry format, tools cannot be discovered from third-party services at runtime, and independently authored integrations have to be re-wired for each orchestrator.

MCP addresses these issues by offering a standardized way for external components to declare and serve tools, prompts, and resources, enabling agent frameworks to discover and use them with minimal custom integration.

It is important to note that MCP does not change how stateless LLM inference APIs like OpenAI’s chat/completions work - each call is still a one-shot inference using the full message history and tool definitions passed at that moment. The difference lies in how the agent orchestrator behaves between those calls.

If you are building your own orchestration loop, you can fully implement dynamic behavior even with standard function calling: just maintain an internal registry of available tools and pass a different tools array on each iteration based on logic, user state, or context. This works perfectly well for custom, tightly coupled pipelines.
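
A minimal sketch of such a loop, assuming the OpenAI Python SDK (the get_time tool and the model name are placeholders):

import json
from datetime import datetime, timezone
from openai import OpenAI

client = OpenAI()

def get_time() -> str:
    # Stand-in for a registry dispatch; returns the current UTC time
    return datetime.now(timezone.utc).isoformat()

TOOLS = [{
    "type": "function",
    "function": {
        "name": "get_time",
        "description": "Return the current time (UTC).",
        "parameters": {"type": "object", "properties": {}},
    },
}]

messages = [{"role": "user", "content": "What time is it?"}]
while True:
    resp = client.chat.completions.create(
        model="gpt-4o", messages=messages, tools=TOOLS)
    msg = resp.choices[0].message
    if not msg.tool_calls:       # plain answer - the loop terminates
        print(msg.content)
        break
    messages.append(msg)         # keep the tool_call in the history
    for call in msg.tool_calls:
        result = get_time(**json.loads(call.function.arguments))
        messages.append({"role": "tool", "tool_call_id": call.id,
                         "content": result})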

What MCP offers is a standardized interface for discovering tools, prompts, and resources from third-party services or independently authored modules. Instead of managing internal configuration or hardcoding tool schemas, your agent can query one or more MCP servers (via stdio or HTTP), retrieve metadata, and invoke tools in a modular and loosely coupled fashion.

This makes MCP ideal for plugin-like systems, distributed agents, or any setup where the components evolve independently but need a shared protocol for coordination.

Core Components of MCP

At its core, the MCP defines a few key concepts:

Tools (Methods, Functions)

Tools are executable methods that the LLM can call. They are described in OpenAI-compatible JSON schema (similar to function calling) and provide input/output specifications.

Examples:

- returning the current time (as in the toy server below),
- publishing a message to an MQTT topic (as in the mcpMQTT server below), or
- querying a database.

These tools can be exposed by the MCP server dynamically and invoked over the wire. If this sounds like function calling then yes - it is the same. The difference is that MCP specifies the format of the function declarations (the JSON schema) as well as the transport over which one exchanges those methods. The orchestrator still fetches the list of relevant methods, passes them into the tools array of the LLM call - from where they get rendered into the chat template - and executes LLM inference exactly the same way as for traditional function calling. When the orchestrator receives a tool_call response, it executes the method by doing an RPC call through the transport in use (a network request or passing the request to an external process). The idea is exactly the same.
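
For illustration, a single tool entry as an MCP server might advertise it via tools/list could look roughly like this (the get_temperature tool is made up; the field names follow the MCP specification):

{
  "name": "get_temperature",
  "description": "Read the current temperature for a given room",
  "inputSchema": {
    "type": "object",
    "properties": {
      "room": { "type": "string", "description": "Room identifier" }
    },
    "required": ["room"]
  }
}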

Prompts (Templates)

Prompts are modular prompt fragments or full templates that the orchestrator can dynamically retrieve and inject into the LLM’s context. This mechanism allows external components to provide specialized behavior, instructions, or personality traits to LLMs in a structured way - especially useful for sub-task handling, formatting conventions, or chain-of-thought scaffolding.

Prompts exposed via MCP each have a name, title, description, and content field. They can be selected either by the LLM itself (from a known list) or injected by the orchestrator based on configuration, the current context, model responses, or user intent.

Once selected, the prompt is fetched from the MCP server and incorporated into the LLM context - typically in one of the following ways:

- as (part of) the system prompt,
- as a user message preceding the actual request, or
- as additional instruction text woven into the conversation history.

From the model’s perspective, it sees the prompt as ordinary text input. There is no special API-level difference - the value lies in the modularity and flexibility of where the content comes from. This design offers several advantages:

- There is a standardized protocol for discovery and injection, eliminating the need for bespoke JSON formats or incompatible APIs between frameworks
- Prompt content can be authored, versioned, and updated by whoever maintains the MCP server, without touching the orchestrator
- The same prompt library can be reused across different agents and frontends
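
On the server side, exposing a prompt with FastMCP is a single decorator. A minimal sketch (the summarize prompt is made up for illustration):

from fastmcp import FastMCP

mcp = FastMCP("prompt-demo")

@mcp.prompt()
def summarize(style: str = "bullet points") -> str:
    """Instruct the model to summarize the conversation so far."""
    # The returned string becomes the prompt content delivered via prompts/get
    return f"Summarize the conversation so far as {style}, keeping all technical details."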

Resources

Resources are structured pieces of non-executable content - such as documents, configuration files, or knowledge snippets - that can be discovered, retrieved, and injected into an LLM’s context.

Examples:

- project documentation or README files,
- configuration files,
- database schema snapshots, or
- curated knowledge snippets.

While such data could also be retrieved via a custom HTTP API or internal agent logic, MCP resources provide a standardized discovery and delivery interface. Each resource includes metadata (name, description, MIME type, path) and can be listed, previewed, and retrieved through the same protocol as other MCP elements.

Why not just use HTTP?

You could expose your documentation or database snapshots via HTTP endpoints. But then you’d need to implement:

- a discovery mechanism so agents can learn which documents exist,
- metadata conventions (names, descriptions, MIME types),
- authentication and access control, and
- a custom client in every orchestrator that should consume the data.

MCP solves this by offering a unified interface that:

- lists the available resources together with their metadata,
- serves their content over the same transport already used for tools and prompts, and
- requires no bespoke client code beyond the MCP client the orchestrator already ships.

How are resources used by an orchestrator?

The orchestrator can list the available resources exposed by MCP servers and decide (based on model requests, current task, or configuration) which ones to load. The content can then be injected:

- into the system prompt,
- as an additional user or assistant message, or
- on demand, when the model explicitly asks for a specific resource (see the client-side sketch below).
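
A minimal client-side sketch, assuming the fastmcp Python SDK and the toy_mcp.py server shown later in this article:

import asyncio
from fastmcp import Client

async def main():
    # Spawns the server script via stdio and speaks MCP to it
    async with Client("toy_mcp.py") as client:
        for res in await client.list_resources():
            print(res.uri, "-", res.name)
        content = await client.read_resource("toy://hello")
        print(content)  # inject into the LLM context as needed

asyncio.run(main())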

In many cases, a resource may represent a dynamic wrapper - for example:

- content generated on each request from a live database query,
- a report rendered from the current system state, or
- the list of currently allowed MQTT topics, as in the mcpMQTT example later in this article.

This makes the MCP server a standardized proxy to external knowledge systems, giving agents the ability to explore and use data on demand - without tightly coupling the orchestrator to each specific backend implementation.

Lifecycle of an MCP Server

A typical MCP server follows this lifecycle:

1. Initialization: the client sends an initialize request, client and server negotiate the protocol version and exchange capabilities, and the client confirms with an initialized notification.
2. Discovery: the orchestrator enumerates what the server offers (tools/list, prompts/list, resources/list).
3. Operation: tools are invoked (tools/call), prompts are fetched (prompts/get), and resources are read (resources/read) as needed.
4. Shutdown: the transport is closed and the server releases its state.
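
On the wire all of this is JSON-RPC 2.0. A stripped-down initialization exchange looks roughly like this (abbreviated; the protocol version string is just an example):

--> {"jsonrpc": "2.0", "id": 1, "method": "initialize",
     "params": {"protocolVersion": "2025-03-26", "capabilities": {},
                "clientInfo": {"name": "my-orchestrator", "version": "0.1"}}}
<-- {"jsonrpc": "2.0", "id": 1,
     "result": {"protocolVersion": "2025-03-26",
                "capabilities": {"tools": {}, "resources": {}},
                "serverInfo": {"name": "toy-mcp", "version": "0.1"}}}
--> {"jsonrpc": "2.0", "method": "notifications/initialized"}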

Transport Mechanisms

MCP supports three major transport methods:

Standard I/O (stdin/stdout)

The server runs as a child process of the orchestrator and exchanges JSON-RPC messages over its standard input and output. This is the simplest option for local, single-user setups: nothing is exposed on the network, there is no authentication to manage, and the server’s lifetime is tied to the orchestrator that launched it.

Streamable HTTP

Here the server exposes a single HTTP endpoint. Clients POST JSON-RPC requests to it, and the server either answers with a plain JSON response or upgrades to a server-sent event stream for long-running operations and server-initiated messages. This transport allows multiple agents to share one persistent server instance over the network.

Session Handling in the FastMCP SDK

The FastMCP SDK uses a session-based access model to manage streamable HTTP endpoints securely and contextually:

- during initialization, the server assigns a session ID and returns it to the client (via the Mcp-Session-Id HTTP header),
- the client has to present this session ID on every subsequent request, and
- the server can attach per-session state, filtering, and permissions to it.

This mechanism allows FastMCP to isolate agents, apply per-session filtering, and potentially enforce authentication and rate limits - without relying on external reverse proxies or middleware.
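
A rough sketch of the session flow from a raw HTTP client’s perspective (using httpx; endpoint path and header handling follow the streamable HTTP conventions, but details vary between server configurations):

import httpx

URL = "http://localhost:8000/mcp"  # default FastMCP streamable HTTP path

init = {"jsonrpc": "2.0", "id": 1, "method": "initialize",
        "params": {"protocolVersion": "2025-03-26", "capabilities": {},
                   "clientInfo": {"name": "demo", "version": "0.1"}}}

# Clients must accept both plain JSON and SSE responses
headers = {"Accept": "application/json, text/event-stream"}

with httpx.Client() as http:
    resp = http.post(URL, json=init, headers=headers)
    session_id = resp.headers.get("mcp-session-id")  # issued on initialize
    headers["Mcp-Session-Id"] = session_id           # required from now on
    http.post(URL, headers=headers,
              json={"jsonrpc": "2.0", "method": "notifications/initialized"})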

Server-Sent Events (SSE)

The older SSE transport uses two endpoints: the client opens a long-lived event stream for server-to-client messages and POSTs its own requests to a separate endpoint. It has largely been superseded by streamable HTTP but is still supported by many clients and servers for backwards compatibility.

While stdin/stdout is fast and simple for single-process local integration, only the HTTP-based transports (streamable HTTP and SSE) support true multi-agent sharing, persistent availability, and network-based security models. These are essential for distributed architectures and plugin-based ecosystems.
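
With FastMCP, selecting the transport is a matter of the run() call. A sketch (host/port are examples; argument names may differ slightly between FastMCP versions):

# stdio (default): launched as a child process by the orchestrator
mcp.run()

# streamable HTTP: shared over the network, session-based
mcp.run(transport="streamable-http", host="127.0.0.1", port=8000)

# legacy SSE transport for older clients
mcp.run(transport="sse", host="127.0.0.1", port=8000)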

Typical applications

MCP is especially well-suited for:

- plugin-like systems where capabilities are authored and deployed independently of the agent framework,
- distributed setups where several agents or orchestrators share the same tool servers, and
- bridging LLM agents to existing infrastructure such as documentation, databases, or message brokers - as in the MQTT example below.

Actual (hopefully useful) implementations

In the following sections we take a look at some small-scale MCP server implementations.

The toy MCP

To demystify MCPs, here is a minimal server you can paste into toy_mcp.py and execute. It exposes a single tool, now(), returning the current ISO timestamp, plus a tiny read-only resource.

from datetime import datetime, timezone
from fastmcp import FastMCP, Context

mcp = FastMCP("toy-mcp")

@mcp.tool(annotations={
    "title": "Return the current time (UTC).",
    "readOnlyHint": True,
    "destructiveHint": False,
})
def now(ctx: Context = None) -> str:
    # Tools are plain functions; FastMCP derives the input schema
    # from the signature and return type.
    return datetime.now(timezone.utc).isoformat()

@mcp.resource("toy://hello")
def hello() -> str:
    return "Hello from Toy MCP! Try the `now` tool."

if __name__ == "__main__":
    mcp.run()  # stdio transport; launchable by your orchestrator

You can utilize this in your orchestrator by configuring the quasi-standardized mcp.json configuration file - you have to look up where this file is located for your specific orchestrator. For Kilo Code, for example, you can either store the settings in the global mcp.json or relative to your current project folder at .kilocode/mcp.json.

{
  "mcpServers": {
    "toy": {
      "command": "python3",
      "args": ["/home/exampleuser/toy_mcp.py"],
      "alwaysAllow": [
         "now"
      ]
    }
  }
}

Interacting with MQTT via an MCP Server

Now let’s do something useful - let’s make MQTT a first-class citizen for LLM agent orchestrators. MQTT is the lingua franca of devices, labs, and home automation. Exposing it through MCP lets an agent safely discover topics, subscribe to live data, and publish commands - all within the same agent workflow (either interactively through a chat session or via a background agent). A few concrete things this unlocks:

- reading live sensor data such as temperatures or humidity directly from a chat session,
- switching actuators like lights by publishing to their command topics, and
- running structured request/response exchanges with services that communicate over MQTT.

In short: the MQTT MCP gives your agent eyes (subscribe), hands (publish) and a voice for structured conversations with devices and services (request/response) - without leaving the chat or agent workflow or having to program in the traditional way.

The implementation

The implementation of this MCP server is a little more complex; it can be found on GitHub and can also be installed from PyPI via

pip install mcpMQTT

The stdio-based MCP server is configured via a single configuration file at ~/.config/mcpmqtt/config.json or at a configurable location specified via the --config parameter. An example configuration file looks like this:

{
  "mqtt": {
    "host": "localhost",
    "port": 1883,
    "username": null,
    "password": null,
    "keepalive": 60
  },
  "topics": [
    {
      "pattern": "sensors/+/temperature",
      "permissions": ["read"],
      "description": "Temperature sensor data from any location (+ matches single level like 'room1', 'room2'. Known rooms are 'exampleroom1' and 'exampleroom2'). Use subscribe, not read on this topic. Never publish."
    },
    {
      "pattern": "sensors/+/humidity",
      "permissions": ["read"],
      "description": "Humidity sensor data from any location. (+ matches single level like 'room1', 'room2'. Known rooms are 'exampleroom1' and 'exampleroom2'). Use subscribe, not read on this topic. Never publish. Data returned as %RH"
    },
    {
      "pattern": "actuators/#",
      "permissions": ["write"],
      "description": "All actuator control topics (# matches multiple levels like 'lights/room1'. To enable a light you write any payload to 'lights/room1/on', to disable you write to 'lights/room1/off')"
    },
    {
      "pattern": "status/system",
      "permissions": ["read"],
      "description": "System status information - exact topic match"
    },
    {
      "pattern": "commands/+/request",
      "permissions": ["write"],
      "description": "Command request topics for request/response patterns"
    },
    {
      "pattern": "commands/+/response",
      "permissions": ["read"],
      "description": "Command response topics for request/response patterns"
    }
  ],
  "logging": {
    "level": "INFO",
    "logfile": null
  }
}

The sections in the main JSON object are:

- mqtt: the broker connection parameters (host, port, optional credentials, keepalive),
- topics: the allowlist of topic patterns, each carrying its permissions (read and/or write) and a description that is surfaced to the agent, and
- logging: the log level and an optional logfile.

The topic permissions allow one to filter which topics can be subscribed or published to by the agent orchestrator - this works in addition to the message broker’s own access configuration.
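
For illustration, a permission check honoring MQTT wildcard semantics could look like this (a simplified sketch, not the actual mcpMQTT source):

def topic_matches(pattern: str, topic: str) -> bool:
    # MQTT semantics: '+' matches exactly one level, '#' matches the rest
    p_parts = pattern.split("/")
    t_parts = topic.split("/")
    for i, p in enumerate(p_parts):
        if p == "#":
            return True
        if i >= len(t_parts) or (p != "+" and p != t_parts[i]):
            return False
    return len(p_parts) == len(t_parts)

def is_allowed(topics: list[dict], topic: str, perm: str) -> bool:
    return any(topic_matches(t["pattern"], topic) and perm in t["permissions"]
               for t in topics)

# is_allowed(config["topics"], "sensors/room1/temperature", "read")  -> True
# is_allowed(config["topics"], "sensors/room1/temperature", "write") -> False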

The tools provided are:

- a publish tool that writes a payload to an allowed topic,
- a subscribe tool that collects incoming messages on a topic,
- a one-shot read tool that waits for the next message on a topic, and
- a request/response tool that publishes a request and awaits the matching reply.

In addition, the MCP server exposes two resources. The mcpmqtt://topics/allowed resource provides a list of all usable topics and their permissions, while mcpmqtt://topics/examples provides usage examples to the agent orchestrator.

Keep in mind that the MCP grants whatever the broker permits and the configuration whitelists - lock it down properly at the broker and in the MCP config, or you will encounter interesting situations.

Why This Design Works Well for Agents

The design makes the MQTT space self-describing to an agent. Instead of handing the model a static set of methods, the server exposes resources that enumerate what’s actually allowed at runtime. mcpmqtt://topics/allowed returns concrete patterns, permissions, and broker hints (host/port; whether auth is required), and you can also serve example expansions so the agent sees how + and # wildcards materialize into real topics. Under the hood this is populated straight from the live config via a small global that the resource reads, so discoverability stays in sync with whatever you’ve deployed.

Safety comes from checking the rules before touching the wire. Every tool - publish, subscribe, one-shot read and request/response - verifies the requested topic against the permission set (read and/or write) and only proceeds if it matches one of the configured patterns. Because the same wildcard semantics used by MQTT are enforced in the MCP layer, an agent can’t accidentally publish or subscribe to a disallowed branch even if it guesses a valid-looking string or wants to exploit your architecture in a way that you do not desire; the tool simply refuses and returns a clear error. This pushes guardrails to the controllable edge, where they belong, and keeps the agent’s often nondeterministic behavior constrained to the intended slice of the broker. This prevents bad surprises from hallucinations or a runaway context.

The async story is robust. Paho delivers messages from its own thread; the client manager records “who’s waiting for what” in a dictionary of asyncio.Futures, and when a message arrives it completes the right future on the correct event loop using loop.call_soon_threadsafe(...) - the thread-safe handoff that avoids racy cross-thread mutations. Waiting is bounded (asyncio.wait_for) with cleanup that removes stale futures, and the RPC helper arms the response listener before publishing the request to prevent missed replies. The MCP server also wraps connection setup/teardown in a lifespan context so tools only run with a live broker and shut down cleanly. The result is an agent interface that’s non-blocking, race-aware, and predictable under timeouts.
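
Schematically, the pattern looks like this (a simplified sketch of the mechanism, not the actual mcpMQTT source):

import asyncio

class MqttClientManager:
    def __init__(self, loop: asyncio.AbstractEventLoop):
        self.loop = loop
        self.pending: dict[str, asyncio.Future] = {}  # topic -> waiter

    # Called on the paho-mqtt network thread
    def on_message(self, client, userdata, msg):
        fut = self.pending.get(msg.topic)
        if fut is not None and not fut.done():
            # Complete the future on the asyncio loop, not on this thread
            self.loop.call_soon_threadsafe(fut.set_result, msg.payload)

    # Called on the asyncio side; arm this *before* publishing a request
    async def wait_for_message(self, topic: str, timeout: float = 10.0) -> bytes:
        fut = self.loop.create_future()
        self.pending[topic] = fut
        try:
            return await asyncio.wait_for(fut, timeout)
        finally:
            self.pending.pop(topic, None)  # remove stale waiters on timeout too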

A mini code tour

mcp_server.py

The server is created again via the FastMCP constructor. In contrast to the simple example above, a lifespan context manager is also supplied that loads the configuration, constructs the MQTT client, yields a context, and cleans up on shutdown. Topic permissions are validated before any network action with validate_topic_permission(...) against your configuration’s patterns. A global _current_config lets other parts of the application read the active configuration to render discoverable topic info.
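
The shape of such a lifespan hook might look like this (a sketch assuming the FastMCP lifespan API; load_config and MqttClientManager stand in for the real implementation):

from contextlib import asynccontextmanager
from fastmcp import FastMCP

@asynccontextmanager
async def lifespan(server: FastMCP):
    config = load_config()              # parse config.json
    client = MqttClientManager(config)  # construct the MQTT client
    await client.connect()
    try:
        yield {"config": config, "client": client}  # reachable from tools
    finally:
        await client.disconnect()       # clean teardown on shutdown

mcp = FastMCP("mcpmqtt", lifespan=lifespan)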

The tools are again exposed via the @mcp.tool annotation - but in contrast to our simple example the functions carry additional annotations: a title for each tool as well as the following (standard MCP) hints:

- readOnlyHint: the tool does not modify any state,
- destructiveHint: the tool may perform destructive changes (for example publishing to actuator topics),
- idempotentHint: calling the tool repeatedly with the same arguments has no additional effect, and
- openWorldHint: the tool interacts with an open set of external entities beyond the server itself.

Resources are annotated with @mcp.resource and get an mcpmqtt:// URL-style prefix.

mqtt_client.py

This contains the MQTT client manager. It builds on paho-mqtt with a thin async layer, manages connection state, and performs all actions exposed to the orchestrator. A pending_responses dictionary maps to the asyncio Futures that are triggered whenever an operation finishes on the paho-mqtt thread. This allows signalling finished operations into the Future’s asyncio loop (a different thread) in a thread-safe way. The wait_for_message method uses this mechanism by awaiting the Future that represents its request.

Conclusion

MCP provides a standardized, modular way to expose tools, resources and prompts to orchestrators - dynamically, at runtime, and without baking assumptions into any single agent framework. By separating capability description from transport and execution, it gives you discoverability, permission boundaries, and composability out of the box. Writing an MCP server is intentionally simple: the toy MCP shows that a few clear tool definitions and a tiny event loop are enough to get a real, inspectable capability surface that any MCP-aware orchestrator can use.

Building on that, the shown example MQTT MCP provides a clean interface to controlled interaction with the real world - IoT devices, home automation, lab gear and robots. Topics are discoverable yet fenced by allowlists; publish/subscribe and request/reply are guarded by timeouts and correlation; async hand-off keeps the client robust under load. The result is a portable, permissioned “hands-and-eyes” layer that scales from a breadboard sensor to a building automation system while remaining easy to reason about and safe to operate.
