In the world of large language models (LLMs) and AI assistants, the Model Context Protocol (MCP) has emerged as a modular, standardized method for exposing tools, structured prompts, and resources to LLMs. Over the past months, it has become the de facto standard for extending capabilities across nearly all major LLM agent frameworks and frontends, including OpenAI's ChatGPT, Anthropic's Claude, and integrations like Kilo Code in Cursor and VSCode.
This article introduces the MCP protocol, explains how it differs from traditional function calling, and walks through its architecture, lifecycle, and transport mechanisms. Later, we will implement a simple MCP server using Python and the FastMCP SDK to expose an MQTT interface to LLM agents.
It is important to emphasize that MCP does not invent anything new. All tasks that MCP can solve can already be solved with function calling, simple JSON documents and file or HTTP resources. But it provides a standardized and modular way to establish loose dynamic coupling and resource injection into orchestrator frameworks.

From Function Calling to Model Context Protocol
Originally, LLMs could interact with external systems using function calling: the orchestrator framework would inspect the current user intent or agent goal, collect the currently relevant tools from its internal registry (or a custom dynamic mechanism), serialize their definitions into OpenAI-compatible JSON, and pass them to the LLM along with the full conversation history in a stateless API call. When the model responds with a tool_call, it means it has processed the input and is requesting a specific method execution. The orchestrator then executes the tool, adds the result to the message stream, and initiates a new inference call. This loop enables dynamic and modular behavior, but the registry and discovery of tools remains framework-specific and not standardized across different agents or environments. This approach, while flexible within a custom orchestrator, still comes with practical limitations in typical usage:
- While tools can be built dynamically, this requires implementation-specific logic and is not standardized.
- Discovery and invocation of tools across systems or services must be manually integrated, often tightly coupled to the orchestrator's architecture.
- Remote execution or tool delegation is possible, but again, it requires bespoke communication layers (e.g., message queues or custom APIs) without a shared protocol.
MCP addresses these issues by offering a standardized way for external components to declare and serve tools, prompts, and resources, enabling agent frameworks to discover and use them with minimal custom integration.
It is important to note that MCP does not change how stateless LLM inference APIs like OpenAI's chat/completions work - each call is still a one-shot inference using the full message history and tool definitions passed at that moment. The difference lies in how the agent orchestrator behaves between those calls.
If you are building your own orchestration loop, you can fully implement dynamic behavior even with standard function calling: just maintain an internal registry of available tools and pass a different tools array on each iteration based on logic, user state, or context. This works perfectly well for custom, tightly coupled pipelines.
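A minimal sketch of such a hand-rolled loop, assuming a hypothetical llm.chat helper and an in-memory tool registry (none of these names come from a specific framework), might look like this:

def select_tools(registry, state):
    # Hypothetical selection logic: only expose the tools relevant to the current state
    return [registry[name] for name in state["relevant_tools"]]

def agent_loop(llm, registry, messages, state):
    while True:
        tools = select_tools(registry, state)                 # a different tools array per iteration
        response = llm.chat(messages=messages, tools=[t["schema"] for t in tools])
        if not response.tool_calls:                           # the model answered directly - done
            return response.content
        for call in response.tool_calls:                      # execute each requested tool locally
            result = registry[call.name]["fn"](**call.arguments)
            messages.append({"role": "tool", "tool_call_id": call.id, "content": str(result)})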
What MCP offers is a standardized interface for discovering tools, prompts, and resources from third-party services or independently authored modules. Instead of managing internal configuration or hardcoding tool schemas, your agent can query one or more MCP servers (via stdio or HTTP), retrieve metadata, and invoke tools in a modular and loosely coupled fashion.
This makes MCP ideal for plugin-like systems, distributed agents, or any setup where the components evolve independently but need a shared protocol for coordination.
Core Components of MCP
At its core, the MCP defines a few key concepts:
Tools are executable methods that the LLM can call. They are described in OpenAI-compatible JSON schema (similar to function calling) and provide input/output specifications.
Examples:
search(query: str)
fetch_weather(location: str)
publish(topic: str, payload: str)
These tools can be exposed by the MCP server dynamically and invoked over the wire. If this sounds like function calling then yes - it is the same. The difference is that MCP specifies the format of the function declarations (the JSON schema) as well as the transport over which those methods are exchanged. The orchestrator still fetches the list of relevant methods, passes them into the tools array of the LLM request - from where they get rendered into the chat template - and executes LLM inference exactly the same way as for traditional function calling. When the orchestrator receives a tool_call response, it executes the method by doing an RPC call through the transport that has been used (a network request or passing the request to an external process). The idea is exactly the same.
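For illustration: a tool definition served by an MCP server carries a name, a description, and a JSON-Schema inputSchema, so mapping it onto the tools array used by function calling is a purely mechanical transformation. A sketch, assuming the usual OpenAI-style function declaration format:

def mcp_tool_to_openai(tool: dict) -> dict:
    # tool: a single entry from an MCP server's tools/list response
    return {
        "type": "function",
        "function": {
            "name": tool["name"],
            "description": tool.get("description", ""),
            "parameters": tool.get("inputSchema", {"type": "object", "properties": {}}),
        },
    }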
Prompts (Templates)
Prompts are modular prompt fragments or full templates that the orchestrator can dynamically retrieve and inject into the LLM's context. This mechanism allows external components to provide specialized behavior, instructions, or personality traits to LLMs in a structured way - especially useful for sub-task handling, formatting conventions, or chain-of-thought scaffolding.
Prompts exposed via MCP each have a name, title, description, and content field. They can be selected either by the LLM itself (from a known list) or injected by the orchestrator based on configuration, the current context, model responses, or user intent.
Once selected, the prompt is fetched from the MCP server and incorporated into the LLM context - typically in one of the following ways:
- As a system message when initiating a subcontext
- As part of the running conversation history
- Appended or prepended to a user query
From the model's perspective, it sees the prompt as ordinary text input. There is no special API-level difference - the value lies in the modularity and flexibility of where the content comes from. This design offers several advantages:
- Avoids hardcoding prompt templates in the agent codebase
- Prompts can be versioned, reused, and maintained independently
- Multiple agents can share prompt libraries
- Prompts can be authored and managed by non-developers
- There is a standardized protocol for discovery and injection, eliminating the need for bespoke JSON formats or incompatible APIs between frameworks
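As a rough orchestrator-side sketch (assuming an already-initialized MCP client session; the prompt name is made up), fetching a prompt and injecting it as a system message could look like this:

async def inject_prompt(session, history):
    # session: an initialized MCP client session (setup omitted); history: the running message list
    available = await session.list_prompts()                  # discover what the server offers
    result = await session.get_prompt("formatting-rules")     # fetch one prompt by (made-up) name
    for message in result.messages:
        history.insert(0, {"role": "system", "content": message.content.text})
    return history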
Resources
Resources are structured pieces of non-executable content - such as documents, configuration files, or knowledge snippets - that can be discovered, retrieved, and injected into an LLM's context.
Examples:
- Markdown files with API documentation
- PDF manuals or changelogs
- JSON or YAML configuration data
- Graph or vector database summaries
While such data could also be retrieved via a custom HTTP API or internal agent logic, MCP resources provide a standardized discovery and delivery interface. Each resource includes metadata (name, description, MIME type, path) and can be listed, previewed, and retrieved through the same protocol as other MCP elements.
Why not just use HTTP?
You could expose your documentation or database snapshots via HTTP endpoints. But then you'd need to implement:
- A discovery layer (what files exist?)
- MIME type inference
- Metadata schema
- Access control or filtering
- Compatibility logic for multiple agents
MCP solves this by offering a unified interface that:
- Makes resources self-describing
- Allows structured querying across different MCP servers
- Integrates into the same transport (pipe or HTTP/SSE)
- Allows agents to dynamically discover and load context-relevant documents without hardcoding logic
How are resources used by an orchestrator?
The orchestrator can list the available resources exposed by MCP servers and decide (based on model requests, current task, or configuration) which ones to load. The content can then be injected:
- As a system or user message
- As part of few-shot context
- Or displayed to the user to let the model comment or reason about it
In many cases, a resource may represent a dynamic wrapper - for example:
- A database-backed MCP resource could stream structured results from a SQL or graph query
- A vector database interface could expose document chunks with embeddings as selectable resources
This makes the MCP server a standardized proxy to external knowledge systems, giving agents the ability to explore and use data on demand - without tightly coupling the orchestrator to each specific backend implementation.
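A corresponding orchestrator-side sketch (again assuming an initialized MCP client session; the resource URI is only an example) might look like this:

async def load_resource_into_context(session, messages, uri="docs://api/reference"):
    # List what the MCP server exposes, then pull one resource into the conversation
    available = await session.list_resources()
    result = await session.read_resource(uri)
    for content in result.contents:
        if hasattr(content, "text"):                   # only inject textual content here
            messages.append({"role": "user", "content": content.text})
    return messages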

Lifecycle of an MCP Server
A typical MCP server follows this lifecycle:
- Startup and Declaration: On launch, the server declares which tools, prompts, and resources it provides. While some implementations (such as those using FastMCP) use a manifest.json-like structure, this is not mandated by the protocol itself - the format and mechanism are implementation-specific and may vary depending on the server architecture or underlying framework.
- Transport Initialization: The server waits for incoming connections, which can come through stdin/stdout pipes (if run as a subprocess) or over HTTP/SSE (for multi-agent deployments).
- Discovery and Listing: When queried, the server returns lists of available tools, prompts, and resources, each with descriptions and schema (for tools).
- Invocation: Agents send requests to invoke tools, retrieve prompts, or load resources. These can happen multiple times over the connection.
- Termination or Keep-Alive: The server continues running as a background service or subprocess, responding to further queries until terminated.
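To make this lifecycle concrete, here is a small client-side sketch based on my understanding of the official mcp Python SDK (treat the exact calls as an assumption): it launches a server as a subprocess over stdio, performs discovery, and invokes a tool.

import asyncio
from mcp import ClientSession, StdioServerParameters
from mcp.client.stdio import stdio_client

async def main():
    params = StdioServerParameters(command="python3", args=["/path/to/my_mcp_server.py"])
    async with stdio_client(params) as (read, write):        # startup + transport initialization
        async with ClientSession(read, write) as session:
            await session.initialize()                       # protocol handshake
            tools = await session.list_tools()               # discovery and listing
            print([t.name for t in tools.tools])
            result = await session.call_tool("search", {"query": "MCP"})   # invocation
            print(result)

asyncio.run(main())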
Transport Mechanisms
MCP supports three major transport methods:
Standard I/O (stdin/stdout)
- Suitable for launching the MCP server as a subprocess.
- Enables fast local communication between a single agent framework and the MCP process.
- Cannot be shared across multiple agents; each orchestrator needs its own subprocess instance.
- Commonly used in local tools like Kilo Code to wrap a Python script or binary in MCP format.
Streamable HTTP
- Stateless interaction via HTTP POST.
- Each request/response cycle is standalone, making it ideal for cloud-hosted or REST-integrated MCP services.
- Supports concurrent requests from multiple agent orchestrators.
- Easily integrated into load-balanced environments or secured with HTTP headers, API tokens, or mTLS.
- Session handling is orchestrator-defined; typical patterns include passing agent identifiers in headers or query parameters.
Session Handling in the FastMCP SDK
The FastMCP SDK uses a session-based access model to manage streamable HTTP endpoints securely and contextually:
- Before any tool, prompt, or resource can be accessed, the client must call the /register endpoint to establish a session.
- The server responds with a token or cookie identifying the session.
- Subsequent requests must include this session token (e.g., via an Authorization or X-Session header).
- If no valid session is provided, requests to endpoints like /invoke or /stream will result in 400 Bad Request or 401 Unauthorized responses.
This mechanism allows FastMCP to isolate agents, apply per-session filtering, and potentially enforce authentication and rate limits - without relying on external reverse proxies or middleware.
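Purely as an illustration of the flow described above (the endpoint paths, header name, and payload shape are assumptions that mirror this description; check your FastMCP version for the actual API), a client might do something like this:

import requests

BASE = "https://mcp.example.org"   # hypothetical MCP server URL

# Establish a session first; the server hands back a token identifying it
resp = requests.post(f"{BASE}/register", json={"agent": "my-orchestrator"})
resp.raise_for_status()
token = resp.json()["token"]

# Subsequent requests carry the session token; without it the server answers 400/401
invoke = requests.post(
    f"{BASE}/invoke",
    headers={"X-Session": token},
    json={"tool": "search", "arguments": {"query": "mcp"}},   # hypothetical payload shape
)
print(invoke.status_code, invoke.json())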
Server-Sent Events (SSE)
- Persistent connection allowing real-time push-style communication.
- Suitable for streaming outputs, monitoring subscriptions, or feeding the model incrementally.
- Multiple clients can connect concurrently, with the server maintaining separate channels for each stream.
- Supports authentication and authorization via HTTP headers (e.g. tokens) at connection initiation.
- Can be combined with structured session registration (e.g. via initial payload or handshake) for access control and audit logging.
While stdin/stdout is fast and simple for single-process local integration, only the HTTP-based transports (streamable HTTP and SSE) support true multi-agent sharing, persistent availability, and network-based security models. These are essential for distributed architectures and plugin-based ecosystems.

Typical applications
MCP is especially well-suited for:
- Plugin-style injection: Tools, prompts, and resources can be registered and served by independent components, allowing agents to integrate third-party capabilities without modifying core logic. This enables true plugin architectures.
- Remote agents and device control: MCP servers can run on other machines, embedded systems, or gateways - allowing LLMs to interact with remote databases, information repositories, long running tasks, lab equipment, home automation, industrial machinery, or sensor networks in a modular way.
- Dynamic capability discovery: An orchestrator or LLM can first reason about the problem at hand (e.g., using semantic search or graph traversal), then query MCP servers to assemble just the right subset of tools needed for that task, reducing overload and improving contextual relevance.
- Shared multi-agent services: MCP servers can be reused across multiple orchestrators or sessions, enabling central services to be shared and scaled cleanly.
- Standardized integration for context enrichment: Prompts and resources can be retrieved from MCP servers on demand, supporting context construction workflows that evolve during a session, rather than being statically defined up front.
Actual (hopefully useful) implementations
In the following sections we take a look at some small-scale MCP server implementations.
- First we take a quick glance at a typical minimal example - a toy server exposing a single tool that you can call from your LLM orchestrator, including an example of how to wire it up for the Kilo Code plugin in Cursor or Code-OSS (VSCode).
- Then we turn towards a very useful extension for many agent pipelines - an MQTT MCP that allows the LLM to access arbitrary MQTT topics to interact with different microservices.
The toy MCP
To demystify MCPs, here is a minimal server you can paste into toy_mcp.py and execute. It exposes a single tool, now(), returning the current ISO timestamp, plus a tiny read-only resource.
from datetime import datetime, timezone
from fastmcp import FastMCP, Context

mcp = FastMCP("toy-mcp")

@mcp.tool(annotations={
    "title": "Return the current time (UTC).",
    "readOnlyHint": True,
    "destructiveHint": False,
})
def now(ctx: Context = None) -> str:
    # Return the current UTC time as an ISO 8601 string
    return datetime.now(timezone.utc).isoformat()

@mcp.resource("toy://hello")
def hello() -> str:
    # A tiny read-only resource the orchestrator can list and fetch
    return "Hello from Toy MCP! Try the `now` tool."

if __name__ == "__main__":
    mcp.run()  # stdio transport; launchable by your orchestrator
You can utilize this in your orchestrator by configuring the quasi-standardized mcp.json configuration file - one has to look up where this file is located for your specific orchestrator. For Kilo Code, for example, you can either store the settings in the global mcp.json or relative to your current project folder at .kilocode/mcp.json.
{
  "mcpServers": {
    "toy": {
      "command": "python3",
      "args": ["/home/exampleuser/toy_mcp.py"],
      "alwaysAllow": [
        "now"
      ]
    }
  }
}

Interacting with MQTT via an MCP Server
Now let's do something useful - let's make MQTT a first-class citizen for LLM agent orchestrators. MQTT is the lingua franca of devices, labs, and home automation. Exposing it through MCP lets an agent safely discover topics, subscribe to live data, and publish commands - all within the same agent workflow (either interactively through a chat session or via a background agent). A few concrete things this unlocks:
- Sense the real world. Subscribe to temperature, vibration, or power meters; read machine status from CNC machines or 3D printers; collect GPS data from trackers; watch door, window or motion sensors.
- Trigger actions on hardware. Publish start/stop, mode changes or setpoints to robots, pumps, lights, HVAC, irrigation, shutters - anything that already speaks MQTT (directly or via a bridge).
- 3D printer and CNC control. Start prints; send "pause", "resume" or "set temperature" commands; request the current layer/state; route alerts into chat; kick off a maintenance macro; react to finished prints.
- Home automation. Toggle scenes, dim lights, arm/disarm alarms, open gates - while keeping commands constrained to allowed topics.
- RPC workflows. Ask a device or service for status, health or metrics on a response topic; trigger one-off jobs (like executing calibrations, taking snapshots, homing devices, etc.) and wait for the reply.
- Interact with different services and microservices. Start CI jobs, deploy canary services, fan out data processing tasks or notify downstream microservices - MQTT as a light control bus for your microservices zoo.
- Update information on dashboards. Subscribe once and forward readings into a database, generate alerts, or feed a live dashboard - handy for experiments and long-running tests.
- Human-in-the-loop safety. Route dangerous commands to an approval queue; separate dry-run from apply topics; log every action and response for audit.
- Edge-cloud bridge. You can utilize this as a bridge between edge devices and the cloud.
In short: the MQTT MCP gives your agent eyes (subscribe), hands (publish) and a voice for structured conversations with devices and services (request/response) - without leaving the chat or agent workflow or having to program in the traditional way.
The implementation
The implementation of this MCP is a little bit more complex; it can be found on GitHub and can also be installed from PyPI.
The stdio-protocol-based MCP server is configured via a single configuration file at ~/.config/mcpmqtt/config.json or at a configurable location specified via the --config parameter. An example configuration file looks like this:
{
  "mqtt": {
    "host": "localhost",
    "port": 1883,
    "username": null,
    "password": null,
    "keepalive": 60
  },
  "topics": [
    {
      "pattern": "sensors/+/temperature",
      "permissions": ["read"],
      "description": "Temperature sensor data from any location (+ matches single level like 'room1', 'room2'. Known rooms are 'exampleroom1' and 'exampleroom2'). Use subscribe, not read on this topic. Never publish."
    },
    {
      "pattern": "sensors/+/humidity",
      "permissions": ["read"],
      "description": "Humidity sensor data from any location. (+ matches single level like 'room1', 'room2'. Known rooms are 'exampleroom1' and 'exampleroom2'). Use subscribe, not read on this topic. Never publish. Data returned as %RH"
    },
    {
      "pattern": "actuators/#",
      "permissions": ["write"],
      "description": "All actuator control topics (# matches multiple levels like 'lights/room1'. To enable a light you write any payload to 'lights/room1/on', to disable you write to 'lights/room1/off')"
    },
    {
      "pattern": "status/system",
      "permissions": ["read"],
      "description": "System status information - exact topic match"
    },
    {
      "pattern": "commands/+/request",
      "permissions": ["write"],
      "description": "Command request topics for request/response patterns"
    },
    {
      "pattern": "commands/+/response",
      "permissions": ["read"],
      "description": "Command response topics for request/response patterns"
    }
  ],
  "logging": {
    "level": "INFO",
    "logfile": null
  }
}
The sections in the main JSON object are:
- mqtt: contains the broker configuration
- topics: provides the pattern, the permissions on the given topic and a description that is used by the LLM to select which topic to use
- logging: provides logging configuration (in a crude way) for debugging
The topic permissions allow one to filter which topics can be subscribed or published to by the agent orchestrator - this works in addition to the message broker's own access configuration.
The tools provided are:
- mqtt_publish simply publishes an event to the message broker (including payload)
- mqtt_subscribe subscribes to a topic and collects a specified maximum number of messages (or returns when the timeout is reached)
- mqtt_read subscribes to a specific topic, waits for a single message and then removes the subscription again
- mqtt_query transmits a request and waits for a single reply (on two different topics - do not forget to use correlation IDs to identify which response belongs to which request; this is not done by the MCP server since it knows nothing about the service)
In addition the MCP server exposes two resources. The mcpmqtt://topics/allowed resource provides a list of all usable topics and their permissions; mcpmqtt://topics/examples additionally provides examples to the agent orchestrators.
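As a usage illustration (the argument names are assumptions derived from the tool descriptions above; check the server's tool schemas for the exact fields), an orchestrator-side request/response call carrying a correlation ID might look like this:

import json, uuid

async def query_device_status(session, device="exampledevice"):
    # session: an initialized MCP client session connected to the MQTT MCP server
    correlation_id = str(uuid.uuid4())   # lets us match the reply to our request
    payload = json.dumps({"action": "status", "correlation_id": correlation_id})
    result = await session.call_tool("mqtt_query", {
        "request_topic": f"commands/{device}/request",    # write-permitted pattern from the config
        "response_topic": f"commands/{device}/response",  # read-permitted pattern from the config
        "payload": payload,
        "timeout": 10,
    })
    return result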
Keep in mind that the MCP grants whatever the broker permits and whatever is whitelisted in its configuration - lock it down properly at the broker and in the MCP config, or you will encounter interesting situations.
Why This Design Works Well for Agents
The design makes the MQTT space self-describing to an agent. Instead of handing the model a static set of methods, the server exposes resources that enumerate what's actually allowed at runtime. mcpmqtt://topics/allowed returns concrete patterns, permissions, and broker hints (host/port; whether auth is required), and you can also serve example expansions so the agent sees how + and # wildcards materialize into real topics. Under the hood this is populated straight from the live config via a small global that the resource reads, so discoverability stays in sync with whatever you've deployed.
Safety comes from checking the rules before touching the wire. Every tool - publish, subscribe, one-shot read and request/response - verifies the requested topic against the permission set (read and/or write) and only proceeds if it matches one of the configured patterns. Because the same wildcard semantics used by MQTT are enforced in the MCP layer, an agent can't accidentally publish to or subscribe to a disallowed branch even if it guesses a valid-looking string or wants to exploit your architecture in a way that you do not desire; the tool simply refuses and returns a clear error. This pushes guardrails to the controllable edge, where they belong, and keeps the agent's often nondeterministic behavior constrained to the intended slice of the broker. This prevents bad surprises from hallucinations or a runaway context.
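A minimal sketch of such a check - not the project's actual validate_topic_permission implementation, just an illustration using paho-mqtt's wildcard matcher - could look like this:

from paho.mqtt.client import topic_matches_sub

topics_config = [
    {"pattern": "sensors/+/temperature", "permissions": ["read"]},
    {"pattern": "actuators/#", "permissions": ["write"]},
]

def is_allowed(topic: str, needed: str, configured_topics: list) -> bool:
    # needed is "read" or "write"; configured_topics mirrors the "topics" section of the config
    for entry in configured_topics:
        if needed in entry["permissions"] and topic_matches_sub(entry["pattern"], topic):
            return True
    return False

assert is_allowed("actuators/lights/room1/on", "write", topics_config)
assert not is_allowed("sensors/room1/temperature", "write", topics_config)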
The async story is robust. Paho delivers messages from its own thread; the client manager records "who's waiting for what" in a dictionary of asyncio Futures, and when a message arrives it completes the right future on the correct event loop using loop.call_soon_threadsafe(...) - the thread-safe handoff that avoids racy cross-thread mutations. Waiting is bounded (asyncio.wait_for) with cleanup that removes stale futures, and the RPC helper arms the response listener before publishing the request to prevent missed replies. The MCP server also wraps connection setup/teardown in a lifespan context so tools only run with a live broker and shut down cleanly. The result is an agent interface that's non-blocking, race-aware, and predictable under timeouts.
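Reduced to its essentials, the pattern looks roughly like this (a sketch, not the project's actual code): a Future is created on the asyncio side and completed from paho's network thread via call_soon_threadsafe.

import asyncio

class PendingResponses:
    def __init__(self, loop: asyncio.AbstractEventLoop):
        self.loop = loop
        self.pending = {}   # topic -> Future waiting for the next message on that topic

    async def wait_for_message(self, topic: str, timeout: float = 10.0) -> str:
        future = self.loop.create_future()
        self.pending[topic] = future
        try:
            return await asyncio.wait_for(future, timeout)   # bounded wait
        finally:
            self.pending.pop(topic, None)                    # clean up stale futures

    def on_message(self, client, userdata, msg):
        # Called from paho's network thread: hand the payload over to the event loop thread-safely
        future = self.pending.get(msg.topic)
        if future is not None and not future.done():
            self.loop.call_soon_threadsafe(future.set_result, msg.payload.decode())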
A mini code tour
mcp_server.py
The server is created again via the FastMCP constructor. In contrast to the simple example above, a lifespan context manager is also supplied that loads the configuration, constructs the MQTT client, yields a context, and cleans up on shutdown. Topic permissions are validated before any network action with validate_topic_permission(...) against your configuration's patterns. A global _current_config lets other parts of the application read the active configuration to render discoverable topic info.
The tools are again exposed via the @mcp.tool annotation - but in contrast to our simple example the functions carry additional annotations: a title for each tool as well as the following hints:
- readOnlyHint signals that the method is a simple getter that has no side effects.
- destructiveHint signals that a method has side effects that cannot be undone.
- idempotentHint tells the orchestrator that the method is idempotent: multiple calls do not accumulate and yield the same result as a single call.
- openWorldHint tells the orchestrator that it is interacting with the open world and not a closed environment.
Resources are annotated with @mcp.resource and get an mcpmqtt:// URL-style prefix.
mqtt_client.py
This contains the MQTT client manager. It builds on paho-mqtt with a thin async layer, manages connection state, and performs all actions exposed to the orchestrator. A pending_responses dictionary maps to the asyncio Futures that are triggered whenever an operation finishes in the paho-mqtt thread. This allows signalling finished operations into the Futures' asyncio loop (a different thread) in a thread-safe way. The wait_for_message method utilizes this mechanism by awaiting the Future that represents its request.
Conclusion
MCP provides a standardized, modular way to expose tools, resources and prompts to orchestrators - dynamically, at runtime, and without baking assumptions into any single agent framework. By separating capability description from transport and execution, it gives you discoverability, permission boundaries, and composability out of the box. Writing an MCP server is intentionally simple: the toy MCP shows that a few clear tool definitions and a tiny event loop are enough to get a real, inspectable capability surface that any MCP-aware orchestrator can use.
Building on that, the shown example MQTT MCP provides a clean interface for controlled interaction with the real world - IoT devices, home automation, lab gear and robots. Topics are discoverable yet fenced by allowlists; publish/subscribe and request/reply are guarded by timeouts and correlation; async hand-off keeps the client robust under load. The result is a portable, permissioned "hands-and-eyes" layer that scales from a breadboard sensor to a building automation system while remaining easy to reason about and safe to operate.
