turmoil/lib.rs
//! Turmoil is a framework for testing distributed systems. It provides
//! deterministic execution by running multiple concurrent hosts within a single
//! thread. It introduces "hardship" into the system via changes in the
//! simulated network. The network can be controlled manually or with a seeded
//! RNG.
//!
//! # Hosts and Software
//!
//! A turmoil simulation is composed of one or more hosts. Hosts run software,
//! which is represented as a `Future`.
//!
//! Test code is also executed on a special host, called a client. This allows
//! for an entry point into the simulated system. Client hosts have all the same
//! capabilities as normal hosts, such as networking support.
//!
//! ```
//! use turmoil;
//!
//! let mut sim = turmoil::Builder::new().build();
//!
//! // register a host
//! sim.host("host", || async {
//!
//!     // host software goes here
//!
//!     Ok(())
//! });
//!
//! // define test code
//! sim.client("test", async {
//!
//!     // we can interact with other hosts from here
//!
//!     Ok(())
//! });
//!
//! // run the simulation and handle the result
//! _ = sim.run();
//!
//! ```
//!
//! # Networking
//!
//! Simulated networking types that mirror `tokio::net` are included in the
//! `turmoil::net` module.
//!
//! Turmoil is not yet opinionated on how to structure your application code to
//! swap in simulated types under test. More on this coming soon...
//!
//! # Network Manipulation
//!
//! The simulation has the following network manipulation capabilities:
//!
//! * [`partition`], which introduces a network partition between hosts
//! * [`repair`], which repairs a network partition between hosts
//! * [`hold`], which holds all "in flight" messages between hosts. Messages are
//!   available for introspection using [`Sim`]'s `links` method.
//! * [`release`], which releases all "in flight" messages between hosts
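//!
//! As a rough sketch of how these calls fit together, the client below
//! partitions and repairs its link to a host, then holds and releases the
//! messages on it. The host names `"server"` and `"client"` are placeholders
//! chosen for this illustration, and the host software is left empty:
//!
//! ```rust
//! let mut sim = turmoil::Builder::new().build();
//!
//! sim.host("server", || async {
//!     // host software goes here
//!     Ok(())
//! });
//!
//! sim.client("client", async {
//!     // drop every message sent between the two hosts...
//!     turmoil::partition("client", "server");
//!     // ...then allow delivery again
//!     turmoil::repair("client", "server");
//!
//!     // keep "in flight" messages queued instead of delivering them...
//!     turmoil::hold("client", "server");
//!     // ...then deliver everything that was held
//!     turmoil::release("client", "server");
//!
//!     Ok(())
//! });
//!
//! sim.run().unwrap();
//! ```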
//!
//! # Filesystem (unstable)
//!
//! *Requires the `unstable-fs` feature.*
//!
//! Simulated filesystem types that mirror `std::fs` and `std::os::unix::fs` are
//! included in the [`fs::shim`] module. This enables crash-consistency testing
//! for storage systems.
//!
//! ```ignore
//! use turmoil::fs::shim::std::fs::OpenOptions;
//! use std::os::unix::fs::FileExt; // Real trait, works with our File
//!
//! # fn example() -> std::io::Result<()> {
//! let file = OpenOptions::new()
//!     .read(true)
//!     .write(true)
//!     .create(true)
//!     .open("/data/db")?;
//!
//! file.write_all_at(b"data", 0)?;
//! file.sync_all()?; // Data now durable, survives crash
//! # Ok(())
//! # }
//! ```
//!
//! Key features:
//!
//! * **Per-host isolation**: Each host has its own filesystem namespace
//! * **Durability model**: Writes go to a pending buffer and become durable on
//!   `sync_all()` or randomly based on [`FsConfig::sync_probability`]
//! * **Crash behavior**: [`Sim::crash`] discards pending writes; synced data
//!   survives [`Sim::bounce`]
//!
//! # Barriers (unstable)
//!
//! *Requires the `unstable-barriers` feature.*
//!
//! Barriers allow tests to observe and control source code execution by
//! injecting hooks at specific points. This enables deterministic testing
//! of complex scenarios without relying on timing or network manipulation.
//!
//! ```ignore
//! use turmoil::barriers::{Barrier, Reaction, trigger};
//!
//! // In source code (conditionally compiled)
//! trigger(MyEvent::PrepareAckReceived(tx_id)).await;
//!
//! // In test code
//! let mut barrier = Barrier::build(Reaction::Suspend, |e: &MyEvent| {
//!     matches!(e, MyEvent::PrepareAckReceived(_))
//! });
//!
//! // Run simulation until barrier triggers
//! let triggered = barrier.wait().await.unwrap();
//! // Source code is now suspended
//!
//! // Resume by dropping the handle
//! drop(triggered);
//! ```
//!
//! # Tracing
//!
//! The `tracing` crate is used to emit important events during the lifetime of
//! a simulation. To enable traces, your tests must install a
//! [`tracing-subscriber`](https://docs.rs/tracing-subscriber/latest/tracing_subscriber/).
//! The log level of turmoil can be configured using `RUST_LOG=turmoil=info`.
//!
//! It is possible to configure your tracing subscriber to log elapsed
//! simulation time instead of real time. See the grpc example.
//!
//! Turmoil can provide a full packet-level trace of the events happening in a
//! simulation when you pass `RUST_LOG=turmoil=trace`. This is useful when you
//! cannot identify the cause of some unexpected behaviour and need to know
//! which packets are reaching which hosts.
//!
//! To see this in effect, you can run the axum example with the following
//! command:
//!
//! ```bash
//! RUST_LOG=INFO,turmoil=TRACE cargo run -p axum-example
//! ```
//!
//! You can see the TCP packets being sent and delivered between the server
//! and the client:
//!
//! ```text
//! ...
//! 2023-11-29T20:23:43.276745Z TRACE node{name="server"}: turmoil: Send src=192.168.0.1:9999 dst=192.168.0.2:49152 protocol=TCP [0x48, 0x54, 0x54, 0x50, 0x2F, 0x31, 0x2E, 0x31, 0x20, 0x32, 0x30, 0x30, 0x20, 0x4F, 0x4B, 0xD, 0xA, 0x63, 0x6F, 0x6E, 0x74, 0x65, 0x6E, 0x74, 0x2D, 0x74, 0x79, 0x70, 0x65, 0x3A, 0x20, 0x74, 0x65, 0x78, 0x74, 0x2F, 0x70, 0x6C, 0x61, 0x69, 0x6E, 0x3B, 0x20, 0x63, 0x68, 0x61, 0x72, 0x73, 0x65, 0x74, 0x3D, 0x75, 0x74, 0x66, 0x2D, 0x38, 0xD, 0xA, 0x63, 0x6F, 0x6E, 0x74, 0x65, 0x6E, 0x74, 0x2D, 0x6C, 0x65, 0x6E, 0x67, 0x74, 0x68, 0x3A, 0x20, 0x31, 0x30, 0xD, 0xA, 0x64, 0x61, 0x74, 0x65, 0x3A, 0x20, 0x57, 0x65, 0x64, 0x2C, 0x20, 0x32, 0x39, 0x20, 0x4E, 0x6F, 0x76, 0x20, 0x32, 0x30, 0x32, 0x33, 0x20, 0x32, 0x30, 0x3A, 0x32, 0x33, 0x3A, 0x34, 0x33, 0x20, 0x47, 0x4D, 0x54, 0xD, 0xA, 0xD, 0xA, 0x48, 0x65, 0x6C, 0x6C, 0x6F, 0x20, 0x66, 0x6F, 0x6F, 0x21]
//! 2023-11-29T20:23:43.276834Z DEBUG turmoil::sim: step 43
//! 2023-11-29T20:23:43.276907Z DEBUG turmoil::sim: step 44
//! 2023-11-29T20:23:43.276981Z DEBUG turmoil::sim: step 45
//! 2023-11-29T20:23:43.277039Z TRACE node{name="client"}: turmoil: Delivered src=192.168.0.1:9999 dst=192.168.0.2:49152 protocol=TCP [0x48, 0x54, 0x54, 0x50, 0x2F, 0x31, 0x2E, 0x31, 0x20, 0x32, 0x30, 0x30, 0x20, 0x4F, 0x4B, 0xD, 0xA, 0x63, 0x6F, 0x6E, 0x74, 0x65, 0x6E, 0x74, 0x2D, 0x74, 0x79, 0x70, 0x65, 0x3A, 0x20, 0x74, 0x65, 0x78, 0x74, 0x2F, 0x70, 0x6C, 0x61, 0x69, 0x6E, 0x3B, 0x20, 0x63, 0x68, 0x61, 0x72, 0x73, 0x65, 0x74, 0x3D, 0x75, 0x74, 0x66, 0x2D, 0x38, 0xD, 0xA, 0x63, 0x6F, 0x6E, 0x74, 0x65, 0x6E, 0x74, 0x2D, 0x6C, 0x65, 0x6E, 0x67, 0x74, 0x68, 0x3A, 0x20, 0x31, 0x30, 0xD, 0xA, 0x64, 0x61, 0x74, 0x65, 0x3A, 0x20, 0x57, 0x65, 0x64, 0x2C, 0x20, 0x32, 0x39, 0x20, 0x4E, 0x6F, 0x76, 0x20, 0x32, 0x30, 0x32, 0x33, 0x20, 0x32, 0x30, 0x3A, 0x32, 0x33, 0x3A, 0x34, 0x33, 0x20, 0x47, 0x4D, 0x54, 0xD, 0xA, 0xD, 0xA, 0x48, 0x65, 0x6C, 0x6C, 0x6F, 0x20, 0x66, 0x6F, 0x6F, 0x21]
//! 2023-11-29T20:23:43.277097Z TRACE node{name="client"}: turmoil: Recv src=192.168.0.1:9999 dst=192.168.0.2:49152 protocol=TCP [0x48, 0x54, 0x54, 0x50, 0x2F, 0x31, 0x2E, 0x31, 0x20, 0x32, 0x30, 0x30, 0x20, 0x4F, 0x4B, 0xD, 0xA, 0x63, 0x6F, 0x6E, 0x74, 0x65, 0x6E, 0x74, 0x2D, 0x74, 0x79, 0x70, 0x65, 0x3A, 0x20, 0x74, 0x65, 0x78, 0x74, 0x2F, 0x70, 0x6C, 0x61, 0x69, 0x6E, 0x3B, 0x20, 0x63, 0x68, 0x61, 0x72, 0x73, 0x65, 0x74, 0x3D, 0x75, 0x74, 0x66, 0x2D, 0x38, 0xD, 0xA, 0x63, 0x6F, 0x6E, 0x74, 0x65, 0x6E, 0x74, 0x2D, 0x6C, 0x65, 0x6E, 0x67, 0x74, 0x68, 0x3A, 0x20, 0x31, 0x30, 0xD, 0xA, 0x64, 0x61, 0x74, 0x65, 0x3A, 0x20, 0x57, 0x65, 0x64, 0x2C, 0x20, 0x32, 0x39, 0x20, 0x4E, 0x6F, 0x76, 0x20, 0x32, 0x30, 0x32, 0x33, 0x20, 0x32, 0x30, 0x3A, 0x32, 0x33, 0x3A, 0x34, 0x33, 0x20, 0x47, 0x4D, 0x54, 0xD, 0xA, 0xD, 0xA, 0x48, 0x65, 0x6C, 0x6C, 0x6F, 0x20, 0x66, 0x6F, 0x6F, 0x21]
//! 2023-11-29T20:23:43.277324Z INFO client: axum_example: Got response: Response { status: 200, version: HTTP/1.1, headers: {"content-type": "text/plain; charset=utf-8", "content-length": "10", "date": "Wed, 29 Nov 2023 20:23:43 GMT"}, body: b"Hello foo!" }
//! ...
//! ```
//!
//! Here the server sends a response, which is then delivered to and received
//! by the client. Each packet trace in turmoil has three steps: `Send` is
//! logged when a packet is sent from one address to another, `Delivered` when
//! the packet reaches its destination, and `Recv` when the destination reads
//! the packet.
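//!
//! The payload in each trace line is printed as a list of raw bytes. As a
//! minimal sketch using only `std` (the `payload_text` helper and the
//! `STATUS_LINE` constant are hypothetical names, not part of turmoil), the
//! first 15 bytes of the payload above can be decoded back into text:
//!
//! ```rust
//! // First 15 bytes of the traced payload shown above.
//! const STATUS_LINE: [u8; 15] = [
//!     0x48, 0x54, 0x54, 0x50, 0x2F, 0x31, 0x2E, 0x31, 0x20, 0x32, 0x30, 0x30,
//!     0x20, 0x4F, 0x4B,
//! ];
//!
//! // Render payload bytes as text, replacing any invalid UTF-8 sequences.
//! fn payload_text(bytes: &[u8]) -> String {
//!     String::from_utf8_lossy(bytes).into_owned()
//! }
//! ```
//!
//! Here `payload_text(&STATUS_LINE)` yields `HTTP/1.1 200 OK`, the status line
//! of the response that the server sent.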
//!
//! # Feature flags
//!
//! * `regex`: Enables regex host resolution through `ToIpAddrs`
//!
//! ## tokio_unstable
//!
//! Turmoil uses [unhandled_panic] to forward host panics as test failures. See
//! [unstable features] to opt in.
//!
//! [unhandled_panic]:
//!     https://docs.rs/tokio/latest/tokio/runtime/struct.Builder.html#method.unhandled_panic
//! [unstable features]: https://docs.rs/tokio/latest/tokio/#unstable-features

#[cfg(doctest)]
mod readme;

mod builder;

use std::{net::IpAddr, time::Duration};

pub use builder::Builder;

mod config;
use config::Config;

mod dns;
use dns::Dns;
pub use dns::{ToIpAddr, ToIpAddrs, ToSocketAddrs};

mod envelope;
use envelope::Envelope;
pub use envelope::{Datagram, Protocol, Segment};

mod error;
pub use error::Result;

mod host;
use host::Host;

mod ip;
pub use ip::IpVersion;

#[cfg(feature = "unstable-fs")]
pub mod fs;
#[cfg(feature = "unstable-fs")]
pub use fs::FsConfig;

#[cfg(feature = "unstable-barriers")]
pub mod barriers;

pub mod net;

mod rt;
use rt::Rt;

mod sim;
pub use sim::Sim;

mod top;
use top::Topology;
pub use top::{LinkIter, LinksIter, SentRef};

mod world;
use world::World;

const TRACING_TARGET: &str = "turmoil";

/// Utility function that calls `f` for every pair of a host in `a` and a host
/// in `b`, skipping pairs in which both sides are the same host.
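///
/// As a standalone illustration of the pairing order, here is a sketch that
/// uses only `std`. The `pairs` and `host` helpers are hypothetical and exist
/// only for this example; `pairs` collects the pairs instead of invoking a
/// callback:
///
/// ```rust
/// use std::net::{IpAddr, Ipv4Addr};
///
/// // Hypothetical helper: collect every ordered pair `(x, y)` with `x` taken
/// // from `a` and `y` from `b`, skipping pairs where both are the same host.
/// fn pairs(a: &[IpAddr], b: &[IpAddr]) -> Vec<(IpAddr, IpAddr)> {
///     let mut out = Vec::new();
///     for first in a {
///         for second in b {
///             if first != second {
///                 out.push((*first, *second));
///             }
///         }
///     }
///     out
/// }
///
/// // Hypothetical helper: build an address in the 192.168.0.0/24 range.
/// fn host(n: u8) -> IpAddr {
///     IpAddr::V4(Ipv4Addr::new(192, 168, 0, n))
/// }
/// ```
///
/// With `a = [host(1), host(2)]` and `b = [host(2), host(3)]`, `pairs` returns
/// `(host(1), host(2))`, `(host(1), host(3))` and `(host(2), host(3))`; the
/// pairing of `host(2)` with itself is skipped.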
pub(crate) fn for_pairs(a: &Vec<IpAddr>, b: &Vec<IpAddr>, mut f: impl FnMut(IpAddr, IpAddr)) {
    for first in a {
        for second in b {
            // skip for the same host
            if first != second {
                f(*first, *second)
            }
        }
    }
}

/// Returns whether the caller is in a simulation context.
pub fn in_simulation() -> bool {
    World::try_current(|_| ()).is_ok()
}

/// Returns how long the currently executing host has been executing for in
/// virtual time.
///
/// Must be called from within a Turmoil simulation.
pub fn elapsed() -> Duration {
    World::current(|world| world.current_host().timer.elapsed())
}

/// Returns how long the simulation has been executing for in virtual time.
///
/// Returns `None` if the duration is not available, typically because
/// there is no currently executing host or world.
pub fn sim_elapsed() -> Option<Duration> {
    World::try_current(|world| {
        world
            .try_current_host()
            .map(|host| host.timer.sim_elapsed())
            .ok()
    })
    .ok()
    .flatten()
}

/// The logical duration from [`UNIX_EPOCH`](std::time::UNIX_EPOCH) until now.
///
/// On creation the simulation picks a `SystemTime` and calculates the
/// duration since the epoch. Each `run()` invocation moves logical time
/// forward by the configured tick duration.
///
/// Returns `None` if the duration is not available, typically because
/// there is no currently executing host or world.
pub fn since_epoch() -> Option<Duration> {
    World::try_current(|world| {
        world
            .try_current_host()
            .map(|host| host.timer.since_epoch())
            .ok()
    })
    .ok()
    .flatten()
}

/// Look up an IP address by host name.
///
/// Must be called from within a Turmoil simulation.
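///
/// A minimal sketch of a lookup from test code, assuming a host registered
/// under the placeholder name `"server"`. It also uses [`reverse_lookup`] to
/// map the address back to the host name:
///
/// ```rust
/// let mut sim = turmoil::Builder::new().build();
///
/// sim.host("server", || async { Ok(()) });
///
/// sim.client("client", async {
///     // resolve the host name to its simulated IP address
///     let addr = turmoil::lookup("server");
///     // the reverse lookup maps the address back to the host name
///     assert_eq!(turmoil::reverse_lookup(addr).as_deref(), Some("server"));
///     Ok(())
/// });
///
/// sim.run().unwrap();
/// ```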
pub fn lookup(addr: impl ToIpAddr) -> IpAddr {
    World::current(|world| world.lookup(addr))
}

/// Perform a reverse DNS lookup, returning the hostname if the entry exists.
///
/// Must be called from within a Turmoil simulation.
pub fn reverse_lookup(addr: IpAddr) -> Option<String> {
    World::current(|world| world.reverse_lookup(addr).map(|h| h.to_owned()))
}

/// Look up IP addresses by host name. Use regex to match a number of hosts.
///
/// Must be called from within a Turmoil simulation.
pub fn lookup_many(addr: impl ToIpAddrs) -> Vec<IpAddr> {
    World::current(|world| world.lookup_many(addr))
}

/// Hold messages between two hosts, or sets of hosts, until [`release`] is
/// called.
///
/// Must be called from within a Turmoil simulation.
pub fn hold(a: impl ToIpAddrs, b: impl ToIpAddrs) {
    World::current(|world| world.hold_many(a, b))
}

/// The opposite of [`hold`]. All held messages are immediately delivered.
///
/// Must be called from within a Turmoil simulation.
pub fn release(a: impl ToIpAddrs, b: impl ToIpAddrs) {
    World::current(|world| world.release_many(a, b))
}

/// Partition two hosts, or sets of hosts, so that all messages sent between
/// them are dropped.
///
/// Must be called from within a Turmoil simulation.
pub fn partition(a: impl ToIpAddrs, b: impl ToIpAddrs) {
    World::current(|world| world.partition_many(a, b))
}

/// Partition two hosts, or sets of hosts, in one direction.
///
/// Must be called from within a Turmoil simulation.
pub fn partition_oneway(from: impl ToIpAddrs, to: impl ToIpAddrs) {
    World::current(|world| world.partition_oneway_many(from, to))
}

/// Repair the connection between two hosts, or sets of hosts, so that
/// messages are delivered again.
///
/// Must be called from within a Turmoil simulation.
pub fn repair(a: impl ToIpAddrs, b: impl ToIpAddrs) {
    World::current(|world| world.repair_many(a, b))
}

/// Repair the connection between two hosts, or sets of hosts, in one direction.
///
/// Must be called from within a Turmoil simulation.
pub fn repair_oneway(from: impl ToIpAddrs, to: impl ToIpAddrs) {
    World::current(|world| world.repair_oneway_many(from, to))
}

/// Return the number of established TCP streams on the current host.
///
/// Must be called from within a Turmoil simulation.
pub fn established_tcp_stream_count() -> usize {
    World::current(|world| world.est_tcp_streams())
}

/// Return the number of established TCP streams on the given host.
///
/// Must be called from within a Turmoil simulation.
pub fn established_tcp_stream_count_on(addr: impl ToIpAddr) -> usize {
    World::current(|world| world.est_tcp_streams_on(addr))
}