Keepers and Chainlink Automation

The passivity problem

A contract has no clock, no scheduler, and no background thread. Nothing inside it ever runs on its own. Every state change traces back to some externally owned account, or another contract, sending a transaction that calls in. Between transactions the contract is inert. It does not poll, it does not wake on a timer, it does not react to the passage of time except by reading block.timestamp when something else calls it.

This is fine for a lot of contracts. An ERC-20 transfer happens because someone asked for it. A swap happens because a trader submitted it. The user supplies both the intent and the gas.

The problem shows up the moment a contract's correct behavior depends on something happening without a user present to make it happen. A few examples that come up constantly:

A lending protocol needs undercollateralized positions liquidated the instant their health factor drops below 1. No borrower is going to liquidate themselves.

A vesting contract needs to release the next tranche of tokens on a schedule, every month, whether or not the beneficiary remembers to claim.

A limit order needs to execute the moment the price crosses a threshold, which might be at 3am when nobody is watching.

A yield vault needs to harvest and recompound rewards on a regular cadence to stay competitive.

In every case the contract knows what should happen and can check whether the condition is met. What it cannot do is trigger itself. Someone has to send the transaction.

The naive fixes and where they break

The first instinct of most developers coming from web2 is to run a bot. Spin up a server, poll the chain every block, and when the condition is met, sign and send the transaction from a hot wallet. This works in a demo and fails in production for predictable reasons.

The bot is a single point of failure. If your server goes down, your dead-man's switch dies with it. If the hot wallet runs out of ETH for gas, every upkeep stalls. If the machine is compromised, the attacker has a funded key. You have reintroduced exactly the centralized, trusted operator that the contract was supposed to do without. For a protocol that markets itself as decentralized, a critical function gated behind one team's cron job is a contradiction users will eventually notice.

The second instinct is to make the work permissionless and let economics handle it. Anyone can call liquidate(), and the caller takes a cut of the seized collateral. This is the pattern the lending lecture covered, and it works beautifully when the task carries a built-in profit. Liquidations get done because liquidators race for the bonus. The same logic drives arbitrage and the closing of expired options.

It falls apart for any task that has no natural payout. Nobody will pay gas to call your vault's harvest() out of goodwill. Nobody will spend money to advance a vesting schedule that benefits someone else. You can try to bolt on an artificial reward, but now you are paying a bounty on every call and inviting people to game the timing. For unprofitable maintenance work, the incentivized-caller model has no one to incentivize.
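To make the cost of that bolt-on reward concrete, here is a minimal sketch of a vault that pays a fixed bounty to whoever calls harvest(). The contract name, the bounty and cooldown parameters, the ETH-funded bounty pool, and the empty _harvest() body are all hypothetical, chosen only for illustration.

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.20;

// Hypothetical sketch: a permissionless harvest() with a bolted-on bounty.
contract BountyHarvestVault {
    uint256 public immutable bounty;     // wei paid to each successful caller (assumed funding model)
    uint256 public immutable cooldown;   // minimum seconds between harvests
    uint256 public lastHarvest;

    error TooSoon();
    error BountyTransferFailed();

    constructor(uint256 bountyWei, uint256 cooldownSeconds) payable {
        bounty = bountyWei;
        cooldown = cooldownSeconds;
        lastHarvest = block.timestamp;
    }

    // Anyone may call. The cooldown limits how often the bounty is paid.
    function harvest() external {
        if (block.timestamp - lastHarvest < cooldown) revert TooSoon();
        lastHarvest = block.timestamp;   // update state before the external call
        _harvest();
        (bool ok, ) = msg.sender.call{value: bounty}("");
        if (!ok) revert BountyTransferFailed();
    }

    // Lets the team top up the bounty pool.
    receive() external payable {}

    function _harvest() internal {
        // ... claim and recompound rewards, omitted ...
    }
}
```

Even in this small form the two costs are visible: the bounty leaves the contract on every successful call, and callers have a reason to race for the first block after the cooldown expires.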

What both fixes are reaching for is a reliable, decentralized party that will send the transaction when the condition is met and charge you a fair price for the gas. That is what Chainlink Automation is.

Automation, called Chainlink Keepers until its 2022 rename, is a network of independent nodes that monitor registered upkeeps and execute them on-chain when they are due. The same decentralization and crypto-economic guarantees behind Chainlink's price feeds apply here. No single node decides whether your function runs, and the network is paid in LINK to cover gas plus a premium.

The custom-logic integration rests on a two-function interface your contract implements.

```solidity
// Solidity 0.8.20, Ethereum mainnet
interface AutomationCompatibleInterface {
    function checkUpkeep(bytes calldata checkData)
        external
        returns (bool upkeepNeeded, bytes memory performData);

    function performUpkeep(bytes calldata performData) external;
}
```

The split between these two functions is the whole idea, and it mirrors the off-chain-decides, on-chain-executes shape you already saw with VRF.

checkUpkeep answers one question: is there work to do right now. The Automation nodes call it off-chain, as a simulation, on every block. Because the call never lands in a block, it costs no gas and changes no state. You can read as much as you need inside it. It returns a boolean and an optional performData blob that gets handed to the execution step.

performUpkeep does the actual work. When checkUpkeep returns true, a node packages a real transaction that calls performUpkeep on-chain, pays the gas, and the work happens. This is the only part that touches the chain and the only part you pay for.

Figure: The Automation cycle: check off-chain every block, execute on-chain when needed. Off-chain (simulated, no gas, runs constantly): the decentralized Automation node network simulates checkUpkeep() on your contract every block, asking "is there work to do?", and the answer is mostly false. When upkeepNeeded == true, a real transaction crosses to the chain. On-chain (a real transaction, costs gas, paid in LINK): the Registry and Forwarder send the transaction and pay the gas, calling performUpkeep() with performData, and your contract re-checks, then acts. The LINK balance funds the upkeep. The off-chain check is free and constant. Only performUpkeep touches the chain and costs gas.
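The check and perform split described above can be sketched with the smallest useful upkeep: a counter that advances once per interval. The contract name and its fields are hypothetical. The import path matches the one used elsewhere in this lecture.

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.20;

import {AutomationCompatibleInterface} from
    "@chainlink/contracts/src/v0.8/automation/AutomationCompatible.sol";

// Hypothetical sketch: increments a counter at most once per interval.
contract IntervalCounter is AutomationCompatibleInterface {
    uint256 public immutable interval;   // seconds between increments
    uint256 public lastTimestamp;
    uint256 public counter;

    constructor(uint256 intervalSeconds) {
        interval = intervalSeconds;
        lastTimestamp = block.timestamp;
    }

    // Simulated off-chain: "is there work to do right now?"
    function checkUpkeep(bytes calldata)
        external
        view
        override
        returns (bool upkeepNeeded, bytes memory performData)
    {
        upkeepNeeded = block.timestamp - lastTimestamp >= interval;
        performData = "";                // nothing to hand to performUpkeep here
    }

    // Mined on-chain: checks the condition again, then does the work.
    function performUpkeep(bytes calldata) external override {
        if (block.timestamp - lastTimestamp < interval) return;
        lastTimestamp = block.timestamp;
        counter += 1;
    }
}
```

Because performUpkeep checks the interval itself, a call from any address at the wrong time is a no-op, which is why this sketch leaves the function open.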

The three trigger types

Automation exposes three ways to fire an upkeep, and picking the right one keeps your contract simple.

Time-based triggers run a function on a fixed schedule given as a cron expression: every hour, every day at midnight, or the first of each month. You point the registration at a target function and a schedule, and you write no checkUpkeep at all. This is the right tool for vesting releases, periodic rebalances, and anything that runs on the calendar rather than on a condition.
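A target function for a time-based upkeep might look like the sketch below. Everything in it is hypothetical: the contract name, the ETH-denominated tranches, and the on-chain period guard, which is an added safety check rather than something the cron trigger requires.

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.20;

// Hypothetical sketch: a target function for a time-based (cron) upkeep.
contract PeriodicVesting {
    address public immutable beneficiary;
    uint256 public immutable trancheAmount;  // wei released per period
    uint256 public immutable period;         // seconds between releases
    uint256 public nextRelease;

    error NotDueYet();
    error TransferFailed();

    constructor(address to, uint256 amountPerTranche, uint256 periodSeconds) payable {
        beneficiary = to;
        trancheAmount = amountPerTranche;
        period = periodSeconds;
        nextRelease = block.timestamp + periodSeconds;
    }

    // The cron upkeep is registered against this function. No checkUpkeep is written.
    // The on-chain schedule check makes it safe for any caller.
    function releaseTranche() external {
        if (block.timestamp < nextRelease) revert NotDueYet();
        nextRelease += period;               // update state before the external call
        (bool ok, ) = beneficiary.call{value: trancheAmount}("");
        if (!ok) revert TransferFailed();
    }
}
```

If the on-chain period is longer than the shortest gap between cron firings, that firing reverts with NotDueYet and the release waits for the next one, so the two schedules should be chosen together.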

Custom-logic triggers use the checkUpkeep and performUpkeep pair above. The nodes poll your checkUpkeep every block, and when it returns true they call performUpkeep. This is the tool for condition-based work where you cannot predict the timing in advance: a price crossing a threshold, a quorum being reached, or a buffer filling up.

Log triggers fire when a specific event log is emitted on-chain. Instead of checkUpkeep, you implement checkLog, which receives the matching log and decides whether to act. This suits event-driven reactions, such as responding to a deposit, a governance vote, or a cross-contract signal, without polling state every block.
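A log-triggered contract might be sketched as follows. The ILogAutomation interface, the Log struct, and the import path are written as I understand the Chainlink contracts package and should be checked against the installed version. The contract name, the Deposit event layout, and the threshold are hypothetical.

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.20;

// ASSUMPTION: import path and Log struct fields as published in the
// Chainlink contracts package. Verify against your installed version.
import {ILogAutomation, Log} from
    "@chainlink/contracts/src/v0.8/automation/interfaces/ILogAutomation.sol";

// Hypothetical sketch: counts large deposits announced by another contract's
// event Deposit(address indexed user, uint256 amount).
contract DepositReactor is ILogAutomation {
    uint256 public immutable minAmount;
    uint256 public largeDeposits;

    constructor(uint256 minimum) {
        minAmount = minimum;
    }

    // Simulated off-chain when a log matching the registered filter appears.
    function checkLog(Log calldata log, bytes memory)
        external
        view
        override
        returns (bool upkeepNeeded, bytes memory performData)
    {
        // Assumed layout: topics[0] is the event signature, topics[1] the indexed user.
        address user = address(uint160(uint256(log.topics[1])));
        uint256 amount = abi.decode(log.data, (uint256));
        upkeepNeeded = amount >= minAmount;
        performData = abi.encode(user, amount);
    }

    // Sketch only: performData is caller-supplied, so a real contract would
    // restrict this function to the upkeep's forwarder.
    function performUpkeep(bytes calldata performData) external override {
        (, uint256 amount) = abi.decode(performData, (address, uint256));
        if (amount < minAmount) return;
        largeDeposits += 1;
    }
}
```

Which contract and event signature the upkeep listens to is set in the log filter at registration, not in the contract itself.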

For most non-scheduled DeFi mechanics, custom logic is the default, so the rest of this lecture stays with it.

Wiring it up: a stop-loss order

Here is a contract that sells a held asset when its price drops below a floor. It reads a Chainlink price feed, the same latestRoundData and staleness check from the oracle lecture, and it uses Automation to fire the sale automatically.

```solidity
// SPDX-License-Identifier: MIT
// Solidity 0.8.20, Ethereum mainnet
// Automation-compatible stop-loss: sell when the price falls below a floor.
pragma solidity ^0.8.20;

import {AutomationCompatibleInterface} from
    "@chainlink/contracts/src/v0.8/automation/AutomationCompatible.sol";
import {AggregatorV3Interface} from
    "@chainlink/contracts/src/v0.8/shared/interfaces/AggregatorV3Interface.sol";

contract StopLoss is AutomationCompatibleInterface {
    AggregatorV3Interface public immutable priceFeed;
    int256 public immutable floorPrice;   // e.g. 1800 * 1e8 for an 8-decimal feed
    address public immutable owner;
    address public forwarder;             // assigned after the upkeep is registered
    bool public triggered;

    error NotOwner();
    error NotForwarder();
    error PriceStale();
    error ConditionNotMet();

    constructor(address feed, int256 floor) {
        priceFeed = AggregatorV3Interface(feed);
        floorPrice = floor;
        owner = msg.sender;
    }

    // The registry hands you a forwarder address after registration.
    // Lock performUpkeep to it so nobody else can trigger the sale.
    function setForwarder(address fwd) external {
        if (msg.sender != owner) revert NotOwner();
        forwarder = fwd;
    }

    // Simulated off-chain by the nodes on every block. Never mined.
    function checkUpkeep(bytes calldata)
        external
        view
        override
        returns (bool upkeepNeeded, bytes memory)
    {
        if (triggered) return (false, "");
        (, int256 price,, uint256 updatedAt,) = priceFeed.latestRoundData();
        // Staleness window: match it to the heartbeat of the feed you use.
        bool fresh = block.timestamp - updatedAt < 1 hours;
        upkeepNeeded = fresh && price > 0 && price < floorPrice;
        // The unnamed bytes return value defaults to empty performData.
    }

    // Mined on-chain by the forwarder when checkUpkeep returned true.
    function performUpkeep(bytes calldata) external override {
        if (msg.sender != forwarder) revert NotForwarder();

        // Re-validate. The off-chain result is already a few blocks old.
        (, int256 price,, uint256 updatedAt,) = priceFeed.latestRoundData();
        if (block.timestamp - updatedAt >= 1 hours) revert PriceStale();
        if (triggered || price <= 0 || price >= floorPrice) revert ConditionNotMet();

        triggered = true;
        _executeSell();   // swap the held asset against a DEX router, omitted here
    }

    function _executeSell() internal {
        // ... routing and slippage protection ...
    }
}
```

checkUpkeep is marked view. It reads the feed, checks freshness, and reports whether the price has fallen below the floor. The Automation contracts also offer a cannotExecute modifier, available when you inherit the AutomationCompatible base contract rather than only the interface. It makes the function revert if anyone calls it in a real transaction, since it is meant for simulation only.
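Using the modifier might look like the sketch below. It assumes cannotExecute is inherited from the AutomationCompatible base contract in the same file the interface is imported from. The contract name and its daily condition are hypothetical.

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.20;

// ASSUMPTION: AutomationCompatible is the abstract base that provides
// the cannotExecute modifier. Verify against your installed package version.
import {AutomationCompatible} from
    "@chainlink/contracts/src/v0.8/automation/AutomationCompatible.sol";

// Hypothetical sketch: a check that can only run as an off-chain simulation.
contract SimulationOnlyCheck is AutomationCompatible {
    uint256 public lastRun;

    function checkUpkeep(bytes calldata)
        external
        view
        override
        cannotExecute   // reverts if reached through a real transaction
        returns (bool upkeepNeeded, bytes memory performData)
    {
        upkeepNeeded = block.timestamp - lastRun >= 1 days;
        performData = "";
    }

    function performUpkeep(bytes calldata) external override {
        if (block.timestamp - lastRun < 1 days) return;   // re-validate on-chain
        lastRun = block.timestamp;
    }
}
```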

performUpkeep is where the discipline lives, and it is worth slowing down on.

The re-validation requirement

Look at performUpkeep again. It does not assume the sale should happen just because it was called. It re-reads the price and checks the condition from scratch, reverting if the price is stale or has climbed back above the floor.

This re-check is not defensive padding. It is required for correctness, and the reason is timing. checkUpkeep runs off-chain at block N. By the time a node builds the transaction, broadcasts it, and it lands in a block, a few blocks have typically passed. The state the node observed when it decided to act is no longer the current state. Prices move, balances change, other transactions execute in between.

Figure: The gap: state can change between the check and the execution. At block N, checkUpkeep (off-chain) sees price = $1,790, below $1,800, and returns true. The node builds and mines the transaction while a few blocks pass. At block N+2, performUpkeep (on-chain) sees price = $1,815, back above $1,800. If it trusts the stale "true", it sells at $1,815, not $1,790, after the trigger condition is gone: a wrong execution and a real bug. If it re-reads the price, it sees $1,815, finds the condition false, and reverts or returns early: no bad trade. performUpkeep must re-validate the condition. The off-chain "true" is a hint, never a guarantee.

If performUpkeep trusted the stale decision, it would sell at a price the user never agreed to, after the stop-loss condition had already cleared. By re-reading the feed and reverting when the condition is gone, the contract guarantees it only ever acts on conditions that are true at the moment of execution. The off-chain true is a hint that work might be needed. The on-chain function decides whether it actually is.

This generalizes to every Automation integration. Treat checkUpkeep as a cheap filter that tells the network when to bother sending a transaction, and treat performUpkeep as the authority that re-establishes every precondition before doing anything irreversible.

Access control and the forwarder

performUpkeep is externally callable. By default, any address can call it, including ones that have nothing to do with the Automation network. For some upkeeps that is harmless, because the function re-validates and a stranger calling it early just wastes their own gas. For others it is a real risk, because an attacker can call it at a moment that benefits them, front-run the legitimate execution, or pass a malicious performData blob.

The fix is the forwarder. When you register an upkeep, the registry generates a unique forwarder address for it, and that address is the only one the network uses to call your performUpkeep. Once you know it, restrict the function to msg.sender == forwarder, as the stop-loss does. Now the network can trigger the work and nobody else can.

Do not hardcode the forwarder in the constructor. You do not know its address until after registration, which happens after deployment. Set it once afterward through an owner-guarded setter.
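A setter that enforces "set it once" on-chain could be sketched as a small mixin. The contract name, the error names, and the onlyForwarder modifier are hypothetical.

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.20;

// Hypothetical mixin: an owner-guarded forwarder that can be set exactly once.
abstract contract ForwarderGuard {
    address public immutable owner;
    address public forwarder;            // zero until the upkeep is registered

    error NotOwner();
    error NotForwarder();
    error ForwarderAlreadySet();
    error ZeroAddress();

    constructor() {
        owner = msg.sender;
    }

    // Apply to performUpkeep so only the upkeep's forwarder can trigger it.
    modifier onlyForwarder() {
        if (msg.sender != forwarder) revert NotForwarder();
        _;
    }

    function setForwarder(address fwd) external {
        if (msg.sender != owner) revert NotOwner();
        if (forwarder != address(0)) revert ForwarderAlreadySet();
        if (fwd == address(0)) revert ZeroAddress();
        forwarder = fwd;
    }
}
```

The one-shot guard trades flexibility for safety. If you later register a fresh upkeep for the same contract, it gets a different forwarder, and a setter that can be called again, like the one in StopLoss, is what lets you switch to it.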

Funding and lifecycle

An upkeep is paid in LINK. You register it in the Automation app or programmatically through the registrar contract, deposit a LINK balance, and the network deducts from that balance each time it runs performUpkeep, covering the gas plus a premium that pays the node.

When the LINK balance runs dry, the upkeep stops. It does not warn you in the contract, it does not revert, it simply stops being executed, and your liquidations or releases quietly fail to happen. Monitoring the balance and topping it up is operational work you own. Several teams have shipped a contract that worked perfectly in testing and then watched a critical upkeep go silent weeks later because nobody refilled the LINK.
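One way to make the balance observable is a small read-only helper that a dashboard or alerting script can query. This is a sketch under a loud assumption: the two registry functions declared below are written from memory and must be checked against the registry version your upkeep is registered on. The helper contract and its buffer parameter are hypothetical.

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.20;

// ASSUMPTION: a minimal view of the Automation registry. Function names and
// return types are unverified here. Confirm them for your registry version.
interface IRegistryBalances {
    function getBalance(uint256 upkeepId) external view returns (uint96);
    function getMinBalance(uint256 upkeepId) external view returns (uint96);
}

// Hypothetical helper: reports when an upkeep's LINK balance is getting low.
contract UpkeepBalanceMonitor {
    IRegistryBalances public immutable registry;
    uint256 public immutable upkeepId;
    uint256 public immutable bufferMultiple;  // alert when balance < minimum * bufferMultiple

    constructor(address registryAddr, uint256 id, uint256 multiple) {
        registry = IRegistryBalances(registryAddr);
        upkeepId = id;
        bufferMultiple = multiple;
    }

    // Read-only: intended for off-chain polling, not for on-chain enforcement.
    function needsTopUp()
        external
        view
        returns (bool low, uint256 balance, uint256 threshold)
    {
        balance = registry.getBalance(upkeepId);
        threshold = uint256(registry.getMinBalance(upkeepId)) * bufferMultiple;
        low = balance < threshold;
    }
}
```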

The other limit to plan for is gas. Each upkeep has a configured gas limit for performUpkeep. If your execution exceeds it, the transaction reverts and the work does not get done. This makes unbounded loops especially dangerous here, because the loop might fit under the limit in a test with three items and blow past it in production with three thousand.

What people get wrong

Trusting the off-chain check. The most common and most damaging mistake is writing a performUpkeep that does the work unconditionally because "checkUpkeep already verified it." By the time performUpkeep runs, the check is several blocks stale. Re-validate every precondition on-chain.

Leaving performUpkeep open. Forgetting the forwarder guard exposes the function to anyone. Depending on what the upkeep does, that ranges from harmless to exploitable. Lock it to the forwarder unless you have a specific reason the function is safe to call by anyone.

Fearing an expensive checkUpkeep. Developers sometimes cram complex logic into performUpkeep to keep checkUpkeep cheap, which is backwards. checkUpkeep runs off-chain and costs nothing to simulate, within the node's generous limits, so it is the right place for heavy reads and iteration. performUpkeep is the part that costs real gas and runs under a hard limit, so keep it lean and let the check do the looking.

Unbounded loops in performUpkeep. A loop over all positions or all users passes in a small test and reverts in production once the set grows past the gas limit. Cap the work per execution, or use performData to tell performUpkeep exactly which items to process, computed during the free off-chain check.
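The last two points can be sketched together: the scan runs in checkUpkeep, and performUpkeep receives a capped list of item ids through performData and re-validates each one. The contract name, the deadline-based items, and the batch size of 20 are hypothetical.

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.20;

import {AutomationCompatibleInterface} from
    "@chainlink/contracts/src/v0.8/automation/AutomationCompatible.sol";

// Hypothetical sketch: expire due items in bounded batches.
contract BatchExpirer is AutomationCompatibleInterface {
    uint256 public constant MAX_BATCH = 20;   // cap on work per execution (assumed value)

    uint256[] public deadlines;               // deadlines[i] is item i's expiry time
    mapping(uint256 => bool) public expired;

    error BatchTooLarge();

    // Open to anyone in this sketch. A real contract would restrict it.
    function addItem(uint256 deadline) external {
        deadlines.push(deadline);
    }

    function _isDue(uint256 id) internal view returns (bool) {
        return id < deadlines.length && !expired[id] && block.timestamp >= deadlines[id];
    }

    // Off-chain simulation: scan the whole list, but report at most MAX_BATCH ids.
    function checkUpkeep(bytes calldata)
        external
        view
        override
        returns (bool upkeepNeeded, bytes memory performData)
    {
        uint256[] memory found = new uint256[](MAX_BATCH);
        uint256 count;
        for (uint256 i = 0; i < deadlines.length && count < MAX_BATCH; i++) {
            if (_isDue(i)) {
                found[count] = i;
                count++;
            }
        }
        // Copy into an array sized to the ids actually found.
        uint256[] memory ids = new uint256[](count);
        for (uint256 j = 0; j < count; j++) {
            ids[j] = found[j];
        }
        upkeepNeeded = count > 0;
        performData = abi.encode(ids);
    }

    // On-chain: bounded loop, and every id is re-validated before acting.
    function performUpkeep(bytes calldata performData) external override {
        uint256[] memory ids = abi.decode(performData, (uint256[]));
        if (ids.length > MAX_BATCH) revert BatchTooLarge();
        for (uint256 i = 0; i < ids.length; i++) {
            if (_isDue(ids[i])) {
                expired[ids[i]] = true;
            }
        }
    }
}
```

The on-chain cost now depends on MAX_BATCH rather than on how long the list has grown, and ids that are stale or invalid by execution time are skipped instead of trusted.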

Expecting exact timing. Automation is best-effort. It fires soon after the condition is met, usually within a block or two. The timing is approximate, so anything that needs sub-block precision or guaranteed execution in a specific block should not depend on it.

The raffle task you just built is a clean candidate for this pattern. Its drawWinner currently relies on someone remembering to call it after the deadline. A time-based upkeep pointed at drawWinner, or a custom-logic upkeep whose checkUpkeep returns true once block.timestamp passes the deadline, removes that dependency and makes the draw fire on its own. The contract stays passive and correct, and the network supplies the missing trigger.
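The custom-logic variant could be sketched as a separate upkeep contract that points at the raffle. The IRaffle interface here is an assumption: it supposes the raffle exposes deadline(), drawn(), and drawWinner() with these signatures, which may not match the contract you built.

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.20;

import {AutomationCompatibleInterface} from
    "@chainlink/contracts/src/v0.8/automation/AutomationCompatible.sol";

// ASSUMPTION: the raffle exposes these three functions. Adjust to your contract.
interface IRaffle {
    function deadline() external view returns (uint256);
    function drawn() external view returns (bool);
    function drawWinner() external;
}

// Hypothetical sketch: fires drawWinner() once the deadline has passed.
contract RaffleUpkeep is AutomationCompatibleInterface {
    IRaffle public immutable raffle;

    error NotReady();

    constructor(address raffleAddr) {
        raffle = IRaffle(raffleAddr);
    }

    // Single source of truth for the condition, shared by check and perform.
    function _ready() internal view returns (bool) {
        return !raffle.drawn() && block.timestamp > raffle.deadline();
    }

    function checkUpkeep(bytes calldata)
        external
        view
        override
        returns (bool upkeepNeeded, bytes memory performData)
    {
        upkeepNeeded = _ready();
        performData = "";
    }

    function performUpkeep(bytes calldata) external override {
        if (!_ready()) revert NotReady();   // re-validate at execution time
        raffle.drawWinner();
    }
}
```

Routing both functions through _ready() keeps the off-chain filter and the on-chain authority from drifting apart as the condition evolves.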