Update calls
To interact with a canister's methods, there are two primary types of calls that can be used: update calls and query calls.
Update calls can modify a canister's state, while query calls cannot. Update calls are executed on all nodes of the subnet that the canister is deployed on, since the result of the call must go through consensus. This makes them slower than query calls. They are submitted and answered asynchronously.
Update calls do not go through consensus on the local replica.
A query method can be called as both an update and a query, whereas an update method can only be called as an update.
| Update calls | Query calls |
|---|---|
| Slow (1-2s) | Fast (200-400ms) |
| Can modify state | Can't modify state |
| Goes through consensus | Does not go through consensus |
| Asynchronous response | Synchronous response |
| Executed on all nodes of a subnet | Executed on a single node |
| Cost cycles | Free |
See the reference on ingress messages for a more technical discussion of this topic.
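The distinction can be made concrete with a small code example. The following is a minimal Rust sketch, assuming a hypothetical counter canister and the ic-cdk macros; the `increment` and `get` method names are illustrative only:

```rust
use std::cell::RefCell;

thread_local! {
    // Hypothetical canister state: a single counter value.
    static COUNTER: RefCell<u64> = RefCell::new(0);
}

// Update method: may modify state, is executed on all subnet nodes,
// goes through consensus, and is answered asynchronously.
#[ic_cdk_macros::update]
fn increment() -> u64 {
    COUNTER.with(|c| {
        *c.borrow_mut() += 1;
        *c.borrow()
    })
}

// Query method: read-only, executed on a single node, answered synchronously.
#[ic_cdk_macros::query]
fn get() -> u64 {
    COUNTER.with(|c| *c.borrow())
}
```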
Making update calls with dfx
To make an update call to a canister, use the `dfx canister call` command with the `--update` flag:

- `dfx canister call --update <canister-name> <method_name>`: Make an update call to a canister deployed locally. The local replica must be running in order to call canisters locally. Start it with `dfx start --background`.
- `dfx canister call --update <canister-name> <method_name> --network=playground`: Make an update call to a canister deployed on the playground. Making update calls to canisters deployed on the playground is free, but canisters are temporary and will be removed after 20 minutes.
- `dfx canister call --update <canister-name> <method_name> --network=ic`: Make an update call to a canister deployed on the mainnet. Update calls will cost cycles.
Making update calls from within canisters
Motoko:

```motoko
// Set the value of the counter.
public func set(n : Nat) : async () {
  counter := n;
};
```

Rust:

```rust
/// Set the value of the counter.
#[ic_cdk_macros::update]
fn set(n: Nat) {
    // COUNTER.replace(n); // requires #![feature(local_key_cell_methods)]
    COUNTER.with(|count| *count.borrow_mut() = n);
}
```
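An update method defined this way can also be invoked from another canister. The following is a minimal Rust sketch of such an inter-canister update call, assuming the classic ic-cdk call API (`ic_cdk::api::call::call`); the `set_remote` method name and the `counter_canister` parameter are placeholders, not part of the example above:

```rust
use candid::{Nat, Principal};
use ic_cdk::api::call::{call, CallResult};

// Sketch: forward a new value to the `set` update method of another canister.
// `counter_canister` is a placeholder for the callee's canister ID.
#[ic_cdk_macros::update]
async fn set_remote(counter_canister: Principal, n: Nat) {
    // Inter-canister update call; `set` returns no value, hence the `()` reply type.
    let result: CallResult<()> = call(counter_canister, "set", (n,)).await;
    // A real canister would handle the error; this sketch simply traps on failure.
    result.expect("inter-canister call to `set` failed");
}
```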
Update call errors and their implications
A number of errors could occur in the above flow. The following are examples of errors, indicated using the numbers in the diagram:

1. Due to misconfiguration, the agent might fail to successfully sign the update call and not even send it.
2. The call to the boundary node may fail, for example because the agent or the boundary node is offline and the call times out.
3. Calls (`call` or `read_state`) to the IC node may fail for reasons such as: the node is offline; an execution pre-processing error occurs (status `200` in case of the async call); or the node is incapable of handling the request (status `5xx`).
4. Status responses may never be received. For example, a malicious boundary node could block the status response, so the agent would never see a status like `received`, `processing`, etc. upon calling `read_state`.
5. The certificate in the `read_state` response could be invalid. For example, it could be malformed or have an invalid signature.
6. Finally, the call could be `rejected` (see rejected status and reject codes).
When seeing errors 2-5, the call may still have been executed successfully. For example, the user might lose network connectivity, or a malicious boundary node or replica could return an error but still successfully execute the request. Only if a `rejected` response is received along with a valid certificate does the subnet guarantee that the call hasn't been executed and never will be.
Some operations, such as transfers, must not be executed multiple times. If a call was `rejected` as described above, it is safe to retry the operation. In all other error cases, however, retrying can be risky because the call could still execute successfully, potentially leading to severe security vulnerabilities such as double spending.
In general, there are cases where it is impossible to determine whether a call was successful. For example, suppose that:

- The client loses its connection until the request status has been removed from the state tree. (Recall that ICP removes the request from the system state tree some time after ingress expiry.)
- There is no application-specific way to tell whether the call executed successfully. For example, the canister does not provide a way to find out if the call happened.

In this situation, it is impossible to distinguish between two cases: the call was successful but all information about it has already been removed from the system state tree, or the call never made it into the state tree in the first place and was never executed.
Recommendations and best practices for secure update call handling
The best way to handle the aforementioned errors is to implement safe retries using idempotent calls. Idempotency and several patterns to achieve it are discussed in the best practices on safe retries and idempotency.
It may not always be possible to make call endpoints idempotent. For example, an endpoint that is not under the developer’s control may simply not have this property. In this case, there is still a secure way to retry the calls by relying on built-in deduplication based on request IDs, as we describe below. However, the guarantees in this case are weaker compared to using idempotent calls.
ICP provides a built-in deduplication feature: if two calls with an identical request ID are made, the message is processed at most once. For two calls to have the same `request_id`, they must have the same values for all fields: `request_type`, `sender`, `nonce`, `canister_id`, `method_name`, `arg`, and `ingress_expiry`. This gives idempotency, but only until the `ingress_expiry` is reached, at which point the message expires and ICP may remove the `request_id` from the system state tree. This isn't an issue, since the same request will not be accepted later, as it has expired. Thus, such requests can be safely retried analogously to how it is done for idempotent calls; see the illustration on that page.

As ICP rejects an `ingress_expiry` that is too far in the future, the time window for retrying is limited. To maximize the window, one can maximize `ingress_expiry` in this case and time out the retries after `ingress_expiry` as well.
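As a rough illustration (not a real agent API), the struct below sketches the fields that determine the `request_id`; the field types are an assumption based on the HTTP interface, and the point is simply that changing any of them, including `ingress_expiry`, yields a different `request_id` and therefore a call that is no longer deduplicated against the original one:

```rust
// Hypothetical representation of the fields that determine a call's request_id.
// Resubmitting a call benefits from deduplication only if every one of these
// fields (and hence the signed envelope) is byte-for-byte identical.
struct CallContent {
    request_type: String,   // always "call" for update calls
    sender: Vec<u8>,        // principal of the caller
    nonce: Vec<u8>,         // arbitrary user-provided blob
    canister_id: Vec<u8>,   // target canister
    method_name: String,    // e.g., "set"
    arg: Vec<u8>,           // Candid-encoded arguments
    ingress_expiry: u64,    // expiry time in nanoseconds since the UNIX epoch
}
```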
In summary, there are two options for a client:
- No retries, only poll `read_state`: Submit the `call` once, and poll `read_state` until at least one of the following conditions is met:
  - A certified response was produced. That is, the call status is `replied`, `rejected`, or `done`.
  - The subnet time exceeds `ingress_expiry`, and the call status is not `processing`.

  Evaluating the second condition may be hard if the subnet time and the local time of the user differ.

- Retry the call while polling `read_state`: Do the same as above, but resubmit the initial `call` (same `request_id`) periodically (e.g., every `30s`) until at least one of the following conditions is met:
  - The call is known to the system. That is, the call status is `received`, `processing`, `replied`, `rejected`, or `done`.
  - `ingress_expiry` is reached.

  This can be beneficial because a call that didn't make it through is retried. For example, if the first call was dropped due to a network issue or by a malicious replica, a subsequent call could make it through, e.g., if it is successfully routed to an honest replica.
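To make the second option's control flow explicit, here is a hedged Rust sketch; the `CallStatus` enum and the `resubmit_call` / `read_call_status` helpers are hypothetical stand-ins for whatever agent or HTTP layer the client uses, and local time is used as an approximation of subnet time (which, as noted above, may differ):

```rust
use std::time::{Duration, SystemTime};

// Hypothetical call statuses, mirroring those reported via read_state.
#[derive(Clone, Copy, PartialEq)]
enum CallStatus { Unknown, Received, Processing, Replied, Rejected, Done }

// Hypothetical helpers; a real client would use agent-js, agent-rs, or the raw HTTP interface.
fn resubmit_call(_envelope: &[u8]) { /* send the same signed envelope again */ }
fn read_call_status(_request_id: &[u8]) -> CallStatus { CallStatus::Unknown }

/// Option 2: resubmit the identical call periodically while polling read_state.
fn retry_while_polling(envelope: &[u8], request_id: &[u8], ingress_expiry: SystemTime) -> CallStatus {
    loop {
        let status = read_call_status(request_id);

        // A certified response (replied, rejected, or done) ends the polling.
        if matches!(status, CallStatus::Replied | CallStatus::Rejected | CallStatus::Done) {
            return status;
        }
        // Expiry reached and the call is not processing: stop polling.
        if SystemTime::now() >= ingress_expiry && status != CallStatus::Processing {
            return status;
        }
        // Resubmit the same envelope (same request_id) only while the call is not yet
        // known to the system and ingress_expiry has not been reached.
        if status == CallStatus::Unknown && SystemTime::now() < ingress_expiry {
            resubmit_call(envelope);
        }

        std::thread::sleep(Duration::from_secs(30));
    }
}
```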
At the time of writing, both agent-js (v2.1.3) and agent-rs (v0.39.1) don't retry `call`, so they follow the first option above. By default, both agent-rs and agent-js use an `ingress_expiry` of four minutes and poll for five minutes. Additionally, agent-rs resets the timer to allow for an additional five minutes of polling whenever it observes a `received` or `processing` status.
Some time after expiry is reached, there is no way for the client to determine whether the call was successful unless the canister's API provides this information; in that case, the client can check the call's outcome using application-specific APIs. There can be (hopefully uncommon) cases where the APIs don't allow checking this, in which case the client simply can't tell whether the message has been executed. Ideally, applications set expiry values such that, under typical load, timeouts are rare.